Abstract —The secrecy capacity of the type II wiretap channel 
(WTC II) with a noisy main channel is currently an open 
problem. Herein its secrecy-capacity is derived and shown to be 
equal to its semantic-security (SS) capacity. In this setting, the 
legitimate users communicate via a discrete-memoryless (DM) 
channel in the presence of an eavesdropper that has perfect access 
to a subset of its choosing of the transmitted symbols, constrained 
to a fixed fraction of the blocklength. The secrecy criterion 
is achieved simultaneously for all possible eavesdropper subset 
choices. The SS criterion demands negligible mutual information 
between the message and the eavesdropper’s observations even 
when maximized over all message distributions. 

A key tool for the achievability proof is a novel and stronger 
version of Wyner’s soft covering lemma. Specifically, a random 
codebook is shown to achieve the soft-covering phenomenon 
with high probability. The probability of failure is doubly- 
exponentially small in the blocklength. Since the combined 
number of messages and subsets grows only exponentially with 
the blocklength, SS for the WTC II is established by using the 
union bound and invoking the stronger soft-covering lemma. The 
direct proof shows that rates up to the weak-secrecy capacity 
of the classic WTC with a DM erasure channel (EC) to the 
eavesdropper are achievable. The converse follows by establishing 
the capacity of this DM wiretap EC as an upper bound for the 
WTC II. From a broader perspective, the stronger soft-covering 
lemma constitutes a tool for showing the existence of codebooks 
that satisfy exponentially many constraints, a beneficial ability 
for many other applications in information theoretic security. 

Index Terms —Erasure wiretap channel, information theoretic 
security, semantic-security, soft-covering lemma, wiretap channel 
of type II. 


I. Introduction 

Information theoretic security has adopted the weak-secrecy 
and the strong-secrecy metrics as a standard for measuring 
security. Respectively, weak-secrecy and strong-secrecy refer 
to the normalized and unnormalized mutual information 
between the secret message and the channel symbol string 
observed by the eavesdropper. However, recent work argues 
that, from a cryptographic point of view, both these metrics 
are insufficient to provide security of applications Q, El. 

The work of Z. Goldfeld and H. H. Permuter was supported by an ERC 
starting grant and the Cyber Security Research Center (CSRC) at Ben-Gurion 
University of the Negev. The work of P. Cuff was supported by the 
National Science Foundation (grant CCF-1350595) and the Air Force Office 
of Scientific Research (grant FA9550-15-1-0180). 

This paper was presented in part at the 2016 IEEE International Symposium on 
Information Theory, Barcelona, Spain, and in part at the 2016 IEEE CS International 
Conference on Software Science, Technology and Engineering, Beer-Sheva, 
Israel. 

Z. Goldfeld and H. H. Permuter are with the Department of Electrical 
and Computer Engineering, Ben-Gurion University of the Negev, Beer- 
Sheva, Israel (gziv@post.bgu.ac.il, haimp@bgu.ac.il). Paul Cuff is with the 
Department of Electrical Engineering, Princeton University, Princeton, NJ 
08544 USA (e-mail: cuff@princeton.edu). 


Their main drawback lies in the assumption that the message 
is random and uniformly distributed, as real-life messages are 
neither (messages may be files, votes or any type of structured 
data, often with low entropy). Semantic-security (SS) El, El 
is a cryptographic gold standard that was proposed in El as an 
adequate alternative and shown to be equivalent to a vanishing 
unnormalized mutual information for all message distributions. 
Adopting SS as our secrecy measure, we establish the SS- 
capacity of the wiretap channel of type II (WTC II) with a 
noisy main channel, for which even the secrecy-capacity was 
an open problem until now. On top of that, the SS-capacity 
and the strong-secrecy-capacity are shown to coincide. 

Secret communication over noisy channels dates back to 
Wyner who introduced the degraded wiretap channel (WTC) 
and derived its weak-secrecy-capacity 0. Csiszar and Korner 
extended Wyner’s result to the non-degraded WTC 0, which 
is henceforth referred to as the WTC I. A special instance 
of the WTC I is when the eavesdropper’s observation is an 
outcome of a discrete-memoryless (DM) erasure channel (EC), 
which essentially means that he observes a subset of the 
transmitted symbols which is chosen at random by nature. 
The WTC II was proposed by Ozarow and Wyner ||2l as 
a generalization of this instance, where a more powerful 
eavesdropper selects which subset to observe and security 
must hold versus all possible subset choices. Thus, the main 
challenge in establishing security for the WTC II boils down 
to finding a single sequence of codes that work well for 
each of the exponentially many subsets the eavesdropper may 
choose. In Q, the authors overcome this difficulty when the 
main channel is noiseless by relying on a unique randomized 
coset coding scheme in the proof of achievability. The derived 
rate-equivocation region was also shown to be tight, which 
solved the noiseless main channel scenario. The WTC II with 
a general (i.e., possibly noisy) DM main channel, however, 
remained an open problem ever since. 

A recent attempt to characterize the optimal secrecy rate of the WTC 
II with a noisy main channel was presented in 0 (see also 
0-112 for related work). Requiring a vanishing average 
error probability and security with respect to the weak-secrecy 
metric (namely, while assuming a uniformly distributed message 
and a normalized mutual information), the authors of 0 
extended the coset coding scheme from 0 to obtain an inner 
bound on the rate-equivocation region. An outer bound was 
also established by assuming that the subset the eavesdropper 
chooses to observe is revealed to all parties (i.e., to the 
legitimate users). Specializing these bounds to the maximal 
equivocation results in an inner and an outer bound on the 
weak-secrecy-capacity of a general WTC II; these bounds do 
not match. 

In this work, we strengthen both the reliability and the 
security criteria, and derive the SS-capacity of the WTC II 
with a noisy main channel under a vanishing maximal error 
probability requirement. At the heart of the proof stands a 
stronger version of the soft-covering lemma which is key for 
the security analysis. Wyner’s original soft-covering lemma 
nsi Theorem 6.3] is a valuable tool for achievability proofs 
of information theoretic security m-ini, resolvability im, 
channel synthesis [19], and source coding [20] (see also references 
therein). The result herein sharpens the claim of soft-covering 
by moving away from an expected value analysis. 
Instead, we show that a random codebook achieves the soft- 
covering phenomenon with high probability. The probability 
of failure is doubly-exponentially small in the blocklength, 
enabling more powerful applications through the union bound. 
Specifically, the lemma lets one prove the existence of codebooks 
that satisfy exponentially many secrecy-related constraints, 
which, in turn, resolves the difficulty in the security 
analysis for the WTC II. 

As a simple preliminary application of the stronger soft- 
covering lemma, we derive the SS-capacity of the DM-WTC 
I under a maximal error probability requirement. In ll2Tll . this 
result was established in terms of source universal coding 
based on the expurgation technique (e.g., cf. [22, Theorem 
7.7.1]) for the broadcast channel with confidential messages 
0, which subsumes the WTC I as a special case. Efficient 
code constructions with polynomial complexity that achieve 
the SS-capacity under an average error probability constraint 
were presented in m for the DM scenario and in ll^ for the 
Gaussian case, while li24ll derived the Gaussian SS-capacity 
under a maximal error probability constraint. Complexity not 
being in the scope of this work, we focus on the fundamental 
limits of semantically-secure communication and give 
an alternative proof of the WTC I SS-capacity based on the 
stronger soft-covering lemma and classic wiretap codes. Since 
the number of secret messages is only exponentially large, 
the double-exponential decay the lemma provides ensures SS 
with arbitrarily high probability. In other words, even though a 
codebook that satisfies exponentially many constraints related 
to soft-covering is required, the union bound yields that such 
a codebook exists. This code is then amended to be reliable 
with respect to the maximal error probability by relying on 
the well-known expurgation technique (e.g., cf. [22, Theorem 
7.7.1]). 

Somewhat surprisingly, our optimal code construction for 
the WTC II is just the same. Here, SS involves a vanishing 
unnormalized mutual information (between the message 
and the eavesdropper’s observation), when maximized over 
all message distributions and eavesdropper’s subset choices. 
However, noting that their combined number grows only 
exponentially with the blocklength, the stronger soft-covering 
lemma is still sharp enough to imply that the probability 
of an insecure random wiretap code is doubly-exponentially 
small. As for the WTC I, reliability is upgraded to account for 
maximal error probability using expurgation. The direct proof 
shows that any rate up to the weak-secrecy-capacity of the 
WTC I with a DM-EC^1 to the eavesdropper is achievable. The 
converse follows by showing that the weak-secrecy-capacity 
of this WTC I upper bounds the SS-capacity of the WTC II. 
An important consequence of the WTC II SS-capacity proof is 
that Wyner’s wiretap codes for the erasure WTC I are optimal. 
The binary version of these codes is, in fact, one of the few 
examples for which there are explicit constructions of practical 
secure encoders and decoders with optimal performance ll25]l . 

ESI. 

This paper is organized as follows. Section II provides 
definitions and basic properties. In Section III we state the 
stronger soft-covering lemma and provide its proof. Section 
IV describes the WTC I and gives an alternative derivation of its 
SS-capacity based on the stronger soft-covering lemma. In Section 
V we define the WTC II, state its SS-capacity and prove the 
result. Finally, Section VI summarizes the main achievements 
and insights of this work. 

II. Notations and Preliminaries 

We use the following notations. Given two real numbers 
$a, b$, we denote by $[a : b]$ the set of integers $\{n \in \mathbb{N} \mid \lceil a \rceil \le n \le \lfloor b \rfloor\}$. 
We define $\mathbb{R}_+ = \{x \in \mathbb{R} \mid x \ge 0\}$. Calligraphic 
letters denote sets, e.g., $\mathcal{X}$, the complement of $\mathcal{X}$ is denoted by 
$\mathcal{X}^c$, while $|\mathcal{X}|$ stands for its cardinality. $\mathcal{X}^n$ denotes the $n$-fold 
Cartesian product of $\mathcal{X}$. An element of $\mathcal{X}^n$ is denoted by 
$x^n = (x_1, x_2, \ldots, x_n)$; whenever the dimension $n$ is clear from the 
context, vectors (or sequences) are denoted by boldface letters, 
e.g., $\mathbf{x}$. For any $\mathcal{S} \subseteq [1 : n]$, we use $x^{\mathcal{S}} = (x_i)_{i \in \mathcal{S}}$ to denote 
the substring of $x^n$ defined by $\mathcal{S}$, with respect to the natural 
ordering of $\mathcal{S}$. For instance, if $\mathcal{S} = [i : j]$, where $1 \le i < j \le n$, 
then $x^{\mathcal{S}} = (x_i, x_{i+1}, \ldots, x_j)$. 

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, where $\Omega$ is the sample 
space, $\mathcal{F}$ is the $\sigma$-algebra and $\mathbb{P}$ is the probability measure. 
Random variables over $(\Omega, \mathcal{F}, \mathbb{P})$ are denoted by uppercase 
letters, e.g., $X$, with similar conventions for random vectors. 
The probability of an event $\mathcal{A}$ is denoted by $\mathbb{P}(\mathcal{A})$, while 
$\mathbb{P}(\mathcal{A}|\mathcal{B})$ denotes conditional probability of $\mathcal{A}$ given $\mathcal{B}$. We 
use $\mathbb{1}_{\mathcal{A}}$ to denote the indicator function of $\mathcal{A}$. The set of 
all probability mass functions (PMFs) on a finite set $\mathcal{X}$ is 
denoted by $\mathcal{P}(\mathcal{X})$. PMFs are denoted by the capital letter $P$, 
with a subscript that identifies the random variable and its 
possible conditioning. For example, for a discrete probability 
space $(\Omega, \mathcal{F}, \mathbb{P})$ and two correlated random variables $X$ and 
$Y$ over that space, we use $P_X$, $P_{X,Y}$ and $P_{X|Y}$ to denote, 
respectively, the marginal PMF of $X$, the joint PMF of $(X, Y)$ 
and the conditional PMF of $X$ given $Y$. In particular, $P_{X|Y}$ 
represents the stochastic matrix whose elements are given by 
$P_{X|Y}(x|y) = \mathbb{P}(X = x \mid Y = y)$. We omit subscripts if the 
arguments of the PMF are lowercase versions of the random 
variables. The support of a PMF $P$ and the expectation of 
a random variable $X$ are denoted by $\mathrm{supp}(P)$ and $\mathbb{E}[X]$, 
respectively. 

For a discrete measurable space $(\Omega, \mathcal{F})$, a PMF $Q \in \mathcal{P}(\Omega)$ 
gives rise to a probability measure on $(\Omega, \mathcal{F})$, which we 
denote by $\mathbb{P}_Q$; accordingly, $\mathbb{P}_Q(\mathcal{A}) = \sum_{\omega \in \mathcal{A}} Q(\omega)$ for every 
$\mathcal{A} \in \mathcal{F}$. We use $\mathbb{E}_Q$ to denote an expectation taken with 
respect to $\mathbb{P}_Q$. For a random variable $X$, we sometimes 
write $\mathbb{E}_X$ to emphasize that the expectation is taken with 
respect to $P_X$. For a sequence of random variables $X^n$, if the 
entries of $X^n$ are drawn in an independent and identically 
distributed (i.i.d.) manner according to $P_X$, then for every 
$\mathbf{x} \in \mathcal{X}^n$ we have $P_{X^n}(\mathbf{x}) = \prod_{i=1}^n P_X(x_i)$ and we write 
$P_{X^n}(\mathbf{x}) = P_X^n(\mathbf{x})$. Similarly, if for every $(\mathbf{x}, \mathbf{y}) \in \mathcal{X}^n \times \mathcal{Y}^n$ 
we have $P_{Y^n|X^n}(\mathbf{y}|\mathbf{x}) = \prod_{i=1}^n P_{Y|X}(y_i|x_i)$, then we write 
$P_{Y^n|X^n}(\mathbf{y}|\mathbf{x}) = P^n_{Y|X}(\mathbf{y}|\mathbf{x})$. We often use $Q_X^n$ or $Q^n_{Y|X}$ 
when referring to an i.i.d. sequence of random variables. The 
conditional product PMF $Q^n_{Y|X}$ given a specific sequence 
$\mathbf{x} \in \mathcal{X}^n$ is denoted by $Q^n_{Y|X=\mathbf{x}}$. 

^1 The erasure probability corresponds to the portion of symbols the 
eavesdropper in the WTC II does not intercept. 

Fig. 1. Coding problem with the goal of making $P_{\mathbf{V}}^{(\mathcal{B}_n)} \approx Q_V^n$. 

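The empirical PMF $\nu_{\mathbf{x}}$ of a sequence $\mathbf{x} \in \mathcal{X}^n$ is 

\[ \nu_{\mathbf{x}}(x) = \frac{N(x|\mathbf{x})}{n}, \tag{1} \]

where $N(x|\mathbf{x}) = \sum_{i=1}^n \mathbb{1}_{\{x_i = x\}}$. We use $\mathcal{T}_\epsilon^n(P_X)$ to denote 
the set of letter-typical sequences of length $n$ with respect to 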
the PMF Px and the non-negative number e llTTl Chapter 3], 
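i.e., we have 

\[ \mathcal{T}_\epsilon^n(P_X) = \Big\{ \mathbf{x} \in \mathcal{X}^n : \big|\nu_{\mathbf{x}}(x) - P_X(x)\big| \le \epsilon P_X(x),\ \forall x \in \mathcal{X} \Big\}. \tag{2} \]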
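The empirical PMF in (1) and the letter-typicality test in (2) can be sketched in a few lines of Python. This is only an illustration under assumed conventions: PMFs are dictionaries mapping symbols to probabilities, and every symbol of the sequence is assumed to belong to the alphabet. The function names are hypothetical and not part of the paper.

```python
from collections import Counter

def empirical_pmf(seq, alphabet):
    """Empirical PMF: fraction of positions of seq that equal each symbol."""
    counts = Counter(seq)
    n = len(seq)
    return {a: counts[a] / n for a in alphabet}

def is_letter_typical(seq, pmf, eps):
    """Check |nu(a) - P(a)| <= eps * P(a) for every symbol a of the alphabet."""
    nu = empirical_pmf(seq, pmf.keys())
    return all(abs(nu[a] - p) <= eps * p for a, p in pmf.items())

# Hypothetical example: a fair binary source and two length-8 sequences.
P_X = {0: 0.5, 1: 0.5}
balanced = [0, 1, 1, 0, 1, 0, 0, 1]
skewed = [0] * 7 + [1]
print(is_letter_typical(balanced, P_X, 0.1))
print(is_letter_typical(skewed, P_X, 0.1))
```

Note that a symbol with $P_X(x) = 0$ must not appear in a typical sequence, since the right-hand side of the inequality is then zero.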

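The relative entropy between two probability measures $P$ 
and $Q$ on the same $\sigma$-algebra $\mathcal{F}$ of subsets of the sample space 
$\mathcal{X}$, with $P \ll Q$ (i.e., $P$ is absolutely continuous with respect 
to $Q$) is 

\[ D(P\|Q) = \int_{\mathcal{X}} dP \log\Big(\frac{dP}{dQ}\Big), \tag{3} \]

where $\frac{dP}{dQ}$ denotes the Radon-Nikodym derivative between $P$ 
and $Q$. If the sample space $\mathcal{X}$ is countable, (3) reduces to 

\[ D(P\|Q) = \sum_{x \in \mathrm{supp}(P)} P(x) \log\Big(\frac{P(x)}{Q(x)}\Big). \tag{4} \]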
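For finite alphabets, the sum in (4) is straightforward to evaluate. The sketch below assumes PMFs stored as dictionaries and logarithms to base 2 (bits). Returning infinity when $P$ is not absolutely continuous with respect to $Q$ is a convention chosen here for illustration, since the text defines the relative entropy only for $P \ll Q$.

```python
import math

def relative_entropy(P, Q):
    """D(P||Q) in bits for PMFs given as dicts {symbol: probability}."""
    total = 0.0
    for x, p in P.items():
        if p == 0:
            continue  # only the support of P contributes to the sum
        q = Q.get(x, 0.0)
        if q == 0:
            return math.inf  # assumed convention when P is not dominated by Q
        total += p * math.log2(p / q)
    return total

# Hypothetical example: a point mass against a fair coin costs one bit.
print(relative_entropy({0: 1.0}, {0: 0.5, 1: 0.5}))
```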


III. The Stronger Soft-Covering Lemma 

Wyner’s soft-covering lemma ini Theorem 6.3] states that 
the distribution induced by selecting a $\mathbf{u}$-sequence at random 
from an appropriately chosen set $\mathcal{B}_n$ and passing it through a 
memoryless channel $Q_{V|U}$ results in a good approximation 
of $Q_V^n$ in the limit of large $n$, as long as the set is of size 
$|\mathcal{B}_n| = 2^{nR}$, where $R > I(U;V)$ (Fig. 1). In fact, the set can 
be chosen quite carelessly, by random codebook construction, 
drawing each sequence independently from the distribution 
$Q_U^n$. 

The soft-covering lemmas in the literature use a distance 
metric on distributions (commonly total variation or relative 
entropy) and claim that the distance between the induced 
distribution $P_{\mathbf{V}}^{(\mathcal{B}_n)}$ and the desired distribution $Q_V^n$ vanishes 
in expectation over the random selection of the set.^2 In 
the literature, Cl studies the fundamental limits of soft- 
covering as “resolvability”, ll 2 l provides rates of exponential 
convergence, 01 improves the exponents and extends the 
framework, 121 and Eol Chapter 16] refer to soft-covering 
simply as “covering” in the quantum context, El refers to 
it as a “sampling lemma” and points out that it holds for the 
stronger metric of relative entropy, and El gives a recent 
direct proof of the relative entropy result. 

Here we give a stronger claim. With high probability 
with respect to the set construction, the distance vanishes 
exponentially quickly with the blocklength n. The negligible 
probability of the random set not producing this desired result 
is doubly-exponentially small. 

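Let $\mathcal{W} = [1 : 2^{nR}]$ and $\mathbb{B}_n = \{\mathbf{U}(w)\}_{w \in \mathcal{W}}$ be a set of 
random vectors that are i.i.d. according to $Q_U^n$. We refer to $\mathbb{B}_n$ 
as the random codebook. Let $\mathcal{B}_n = \{\mathbf{u}(w, \mathcal{B}_n)\}_{w \in \mathcal{W}}$ denote a 
realization of $\mathbb{B}_n$. For every fixed $\mathcal{B}_n$, the induced distribution 
is: 

\[ P_{\mathbf{V}}^{(\mathcal{B}_n)}(\mathbf{v}) = 2^{-nR} \sum_{w \in \mathcal{W}} Q^n_{V|U}\big(\mathbf{v} \big| \mathbf{u}(w, \mathcal{B}_n)\big). \tag{5} \]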
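The induced distribution in (5) can be computed exactly for very small blocklengths. The following is a toy sketch, not the paper's construction: it assumes a uniform binary input and a binary symmetric channel with crossover probability `p`, so that the target output distribution is uniform on $\{0,1\}^n$, and it evaluates the relative entropy between the induced and target distributions for one random codebook. All names and parameter values are illustrative assumptions.

```python
import itertools
import math
import random

def bsc(v, u, p):
    """Memoryless BSC(p): probability of output word v given input word u."""
    flips = sum(a != b for a, b in zip(u, v))
    return (p ** flips) * ((1 - p) ** (len(u) - flips))

def soft_covering_divergence(n, num_codewords, p, rng):
    """Relative entropy (bits) between the induced and target output laws.

    Assumes U ~ Bernoulli(1/2) and a BSC(p), so the target is uniform.
    """
    codebook = [tuple(rng.randrange(2) for _ in range(n))
                for _ in range(num_codewords)]
    q = 2.0 ** (-n)
    d = 0.0
    for v in itertools.product((0, 1), repeat=n):
        # Average of the channel laws over the codewords, as in (5).
        induced = sum(bsc(v, u, p) for u in codebook) / num_codewords
        if induced > 0:
            d += induced * math.log2(induced / q)
    return d

rng = random.Random(0)
n, p = 8, 0.2
for m in (1, 4, 16, 64):
    print(m, round(soft_covering_divergence(n, m, p, rng), 4))
```

With a single codeword the divergence equals $n(1 - h(p))$ exactly, and by convexity of relative entropy it cannot increase when more codewords are averaged. How quickly it shrinks for a given codebook depends on the random draw.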


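Lemma 1 (Stronger Soft-Covering Lemma) For any $Q_U$, 
$Q_{V|U}$, and $R > I(U;V)$, where $|\mathcal{V}| < \infty$, there exist 
$\gamma_1, \gamma_2 > 0$, such that for $n$ large enough 

\[ \mathbb{P}\Big( D\big(P_{\mathbf{V}}^{(\mathbb{B}_n)} \big\| Q_V^n\big) > e^{-n\gamma_1} \Big) \le e^{-e^{n\gamma_2}}. \tag{6} \]

More precisely, for any $n \in \mathbb{N}$ and $\delta \in \big(0, R - I(U;V)\big)$, 

\[ \mathbb{P}\Big( D\big(P_{\mathbf{V}}^{(\mathbb{B}_n)} \big\| Q_V^n\big) > c_\delta n 2^{-n\gamma_\delta} \Big) \le \big(1 + |\mathcal{V}|^n\big) e^{-\frac{1}{3} 2^{n\delta}}, \tag{7} \]

where 

\[ \gamma_\delta = \sup_{\alpha > 1} \frac{\alpha - 1}{2\alpha - 1}\Big(R - \delta - d_\alpha(Q_{U,V}, Q_U Q_V)\Big), \tag{8a} \]

\[ c_\delta = 3\log e + 2\gamma_\delta \log 2 + 2\log\Big(\max_{v \in \mathrm{supp}(Q_V)} \frac{1}{Q_V(v)}\Big), \tag{8b} \]

and $d_\alpha(\mu, \nu) = \frac{1}{\alpha - 1}\log_2 \int d\mu \big(\frac{d\mu}{d\nu}\big)^{\alpha - 1}$ is the Rényi divergence 
of order $\alpha$. 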


Remark 1 The inequality (7) is trivially true for $\delta$ outside of 
the expressed range. 

The important quantity in the lemma above is $\gamma_\delta$, which 
is the exponent that soft-covering achieves. We see in (7) that 
the double-exponential convergence of probability occurs with 
exponent $\delta > 0$. Thus, the best soft-covering exponent that the 
lemma achieves with confidence, over all $\delta > 0$, is 

\[ \gamma^* = \sup_{\delta > 0} \gamma_\delta = \gamma_0 = \sup_{\alpha > 1} \frac{\alpha - 1}{2\alpha - 1}\Big(R - d_\alpha(Q_{U,V}, Q_U Q_V)\Big). \tag{9} \]

The double-exponential confidence rate $\delta$ acts as a reduction in 
codebook rate $R$ in the definition of $\gamma_\delta$. Consequently, $\gamma_\delta = 0$ 
for $\delta \ge R - I(U;V)$. 
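To get a feel for the exponent in (8a), the Rényi divergence between a joint PMF and the product of its marginals can be evaluated numerically and the supremum over the order approximated on a grid. The sketch below is illustrative only: the function names, the grid, and the doubly symmetric binary example are assumptions, and a finite grid yields a lower approximation of the supremum. In particular, when the true supremum is zero (approached as the order tends to 1), the grid value is slightly negative.

```python
import math

def renyi_divergence(joint, alpha):
    """d_alpha(Q_UV, Q_U Q_V) in bits; joint is a dict {(u, v): probability}."""
    q_u, q_v = {}, {}
    for (u, v), pr in joint.items():
        q_u[u] = q_u.get(u, 0.0) + pr
        q_v[v] = q_v.get(v, 0.0) + pr
    s = sum(pr * (pr / (q_u[u] * q_v[v])) ** (alpha - 1)
            for (u, v), pr in joint.items() if pr > 0)
    return math.log2(s) / (alpha - 1)

def gamma_delta(joint, rate, delta, alphas):
    """Grid approximation of the supremum over alpha > 1 in (8a)."""
    return max((a - 1) / (2 * a - 1)
               * (rate - delta - renyi_divergence(joint, a))
               for a in alphas)

# Hypothetical example: uniform binary U observed through a BSC(0.2).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
alphas = [1 + 0.01 * k for k in range(1, 1000)]
print(renyi_divergence(joint, 1.0001))   # close to I(U;V)
print(gamma_delta(joint, rate=0.9, delta=0.1, alphas=alphas))
```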

^2 Many of the theorems only claim existence of a good codebook, but all 
of the proofs use expected value to establish existence. 

Remark 2 (Total Variation Exponent of Decay) The 
stronger soft-covering lemma can be reproduced while 
replacing the relative divergence with total variation m- 
Although relative entropy can be used to bound total 
variation via Pinsker’s inequality, this approach causes a 
loss of a factor of 2 in the exponent of decay. Alternatively, 
the proof of Lemma 1 can be modified to produce the bound 
on the total variation instead of the relative entropy. This 
direct method keeps the error exponents the same for the 
total variation case as they are for relative entropy. 

Before proving Lemma 1, we note that the name ‘stronger 
soft-covering lemma’ is justified because (6) implies that 
the expectation of the relative entropy over the ensemble of 
codebooks decays exponentially fast (i.e., Wyner’s notion of 
soft-covering). This is stated in the following lemma and 
proven in Appendix A. 


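Lemma 2 (Stronger than Wyner’s Soft-Covering Lemma) 
Let $\gamma_1, \gamma_2 > 0$ be such that (6) holds for $n$ large enough, 
then for every such $n$, 

\[ \mathbb{E}_{\mathbb{B}_n} D\big(P_{\mathbf{V}}^{(\mathbb{B}_n)} \big\| Q_V^n\big) \le e^{-n\gamma_1} + n\log\Big(\frac{1}{\mu_V}\Big) e^{-e^{n\gamma_2}}, \tag{10} \]

where $\mu_V = \min_{v \in \mathrm{supp}(Q_V)} Q_V(v) > 0$. 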


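Proof of Lemma 1: We state the proof in terms of 
arbitrary distributions (not necessarily discrete). When needed, 
we will specialize to the case that $\mathcal{V}$ is finite. For any fixed 
codebook $\mathcal{B}_n$, let the Radon-Nikodym derivative between the 
induced and desired distributions be denoted as 

\[ \Delta_{\mathcal{B}_n}(\mathbf{v}) = \frac{dP_{\mathbf{V}}^{(\mathcal{B}_n)}}{dQ_V^n}(\mathbf{v}). \tag{11} \]

In the discrete case, this is just a ratio of probability mass 
functions. Accordingly, the relative entropy of interest, which 
is a function of the codebook $\mathcal{B}_n$, is given by 

\[ D\big(P_{\mathbf{V}}^{(\mathcal{B}_n)} \big\| Q_V^n\big) = \int dP_{\mathbf{V}}^{(\mathcal{B}_n)} \log \Delta_{\mathcal{B}_n}. \tag{12} \]

To describe the jointly-typical set over $\mathbf{u}$- and $\mathbf{v}$-sequences, 
we first define information density $i_{Q_{U,V}}$, which is a function 
on the space $\mathcal{U} \times \mathcal{V}$ specified by 

\[ i_{Q_{U,V}}(u, v) = \log\Big(\frac{dQ_{V|U=u}}{dQ_V}(v)\Big). \tag{13} \]

In (13), the argument of the logarithm is the Radon-Nikodym 
derivative between $Q_{V|U=u}$ and $Q_V$. Let $\epsilon > 0$ be arbitrary, 
to be determined later, and define 

\[ \mathcal{A}_\epsilon = \Big\{ (\mathbf{u}, \mathbf{v}) \in \mathcal{U}^n \times \mathcal{V}^n : \frac{1}{n} i_{Q^n_{U,V}}(\mathbf{u}, \mathbf{v}) < I(U;V) + \epsilon \Big\}, \tag{14} \]

and note that 

\[ i_{Q^n_{U,V}}(\mathbf{u}, \mathbf{v}) = \sum_{t=1}^n i_{Q_{U,V}}(u_t, v_t). \tag{15} \]

We split $P_{\mathbf{V}}^{(\mathcal{B}_n)}$ into two parts, making use of the indicator 
function. For every $\mathbf{v} \in \mathcal{V}^n$, define 

\[ P_{\mathcal{B}_n,1}(\mathbf{v}) = 2^{-nR} \sum_{w \in \mathcal{W}} Q^n_{V|U}\big(\mathbf{v} \big| \mathbf{u}(w, \mathcal{B}_n)\big) \mathbb{1}_{\{(\mathbf{u}(w, \mathcal{B}_n), \mathbf{v}) \in \mathcal{A}_\epsilon\}}, \tag{16a} \]

\[ P_{\mathcal{B}_n,2}(\mathbf{v}) = 2^{-nR} \sum_{w \in \mathcal{W}} Q^n_{V|U}\big(\mathbf{v} \big| \mathbf{u}(w, \mathcal{B}_n)\big) \mathbb{1}_{\{(\mathbf{u}(w, \mathcal{B}_n), \mathbf{v}) \notin \mathcal{A}_\epsilon\}}. \tag{16b} \]

The measures $P_{\mathcal{B}_n,1}$ and $P_{\mathcal{B}_n,2}$ on the space $\mathcal{V}^n$ are not 
probability measures, but $P_{\mathcal{B}_n,1} + P_{\mathcal{B}_n,2} = P_{\mathbf{V}}^{(\mathcal{B}_n)}$ for each 
codebook $\mathcal{B}_n$. We also split $\Delta_{\mathcal{B}_n}$ into two parts. Namely, for 
every $\mathbf{v} \in \mathcal{V}^n$, we set 

\[ \Delta_{\mathcal{B}_n,1}(\mathbf{v}) = \frac{dP_{\mathcal{B}_n,1}}{dQ_V^n}(\mathbf{v}), \tag{17a} \]

\[ \Delta_{\mathcal{B}_n,2}(\mathbf{v}) = \frac{dP_{\mathcal{B}_n,2}}{dQ_V^n}(\mathbf{v}). \tag{17b} \]

With respect to the above definitions, Lemma 3 states an 
upper bound on the relative entropy of interest. 

Lemma 3 For every fixed codebook $\mathcal{B}_n$, we have 

\[ D\big(P_{\mathbf{V}}^{(\mathcal{B}_n)} \big\| Q_V^n\big) \le h\Big(\int dP_{\mathcal{B}_n,1}\Big) + \int dP_{\mathcal{B}_n,1} \log \Delta_{\mathcal{B}_n,1} + \int dP_{\mathcal{B}_n,2} \log \Delta_{\mathcal{B}_n,2}, \tag{18} \]

where $h(\cdot)$ is the binary entropy function. 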


The proof is relegated to Appendix B. Based on Lemma 3, if 
the relative entropy of interest does not decay exponentially 
fast, then the same is true for the terms on the right-hand side 
(RHS) of (18). Therefore, to establish Lemma 1 it suffices to 
show that the probability (with respect to a random codebook) 
of the RHS not vanishing exponentially fast to 0 as $n \to \infty$ 
is double-exponentially small. 

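Notice that $P_{\mathcal{B}_n,1}$ usually contains almost all of the probability. 
That is, for any fixed $\mathcal{B}_n$, we have 

\[ \int dP_{\mathcal{B}_n,2} = 1 - \int dP_{\mathcal{B}_n,1} = \sum_{w \in \mathcal{W}} 2^{-nR}\, \mathbb{P}_{Q^n_{V|U}}\Big( \big(\mathbf{u}(w, \mathcal{B}_n), \mathbf{V}\big) \notin \mathcal{A}_\epsilon \,\Big|\, \mathbf{U} = \mathbf{u}(w, \mathcal{B}_n) \Big). \tag{19} \]

For a random codebook, (19) becomes 

\[ \int dP_{\mathbb{B}_n,2} = \sum_{w \in \mathcal{W}} 2^{-nR}\, \mathbb{P}_{Q^n_{V|U}}\Big( \big(\mathbf{U}(w, \mathbb{B}_n), \mathbf{V}\big) \notin \mathcal{A}_\epsilon \,\Big|\, \mathbf{U} = \mathbf{U}(w, \mathbb{B}_n) \Big). \tag{20} \]

The RHS of (20) is an average of exponentially many i.i.d. 
random variables bounded between 0 and 1. Furthermore, 
the expected value of each one is the exponentially small 
probability of correlated sequences being atypical: 

\[ \begin{aligned}
\mathbb{E}_{\mathbb{B}_n} \mathbb{P}_{Q^n_{V|U}}\Big( \big(\mathbf{U}(w, \mathbb{B}_n), \mathbf{V}\big) \notin \mathcal{A}_\epsilon \,\Big|\, \mathbf{U} = \mathbf{U}(w, \mathbb{B}_n) \Big)
&= \mathbb{P}_{Q^n_{U,V}}\big( (\mathbf{U}, \mathbf{V}) \notin \mathcal{A}_\epsilon \big) \\
&= \mathbb{P}_{Q^n_{U,V}}\Big( \sum_{t=1}^n i_{Q_{U,V}}(U_t; V_t) \ge n\big(I(U;V) + \epsilon\big) \Big) \\
&\overset{(a)}{=} \mathbb{P}_{Q^n_{U,V}}\Big( 2^{\lambda \sum_{t=1}^n i_{Q_{U,V}}(U_t; V_t)} \ge 2^{n\lambda(I(U;V) + \epsilon)} \Big) \\
&\overset{(b)}{\le} \frac{\mathbb{E}_{Q^n_{U,V}}\Big[ 2^{\lambda \sum_{t=1}^n i_{Q_{U,V}}(U_t; V_t)} \Big]}{2^{n\lambda(I(U;V) + \epsilon)}} \\
&= \left( \frac{\mathbb{E}_{Q_{U,V}}\big[ 2^{\lambda i_{Q_{U,V}}(U;V)} \big]}{2^{\lambda(I(U;V) + \epsilon)}} \right)^n \\
&\overset{(c)}{=} 2^{n\lambda\left( \frac{1}{\lambda} \log_2 \mathbb{E}_{Q_{U,V}}\left[ 2^{\lambda i_{Q_{U,V}}(U;V)} \right] - I(U;V) - \epsilon \right)} \\
&\overset{(d)}{=} 2^{n\lambda\left( d_{\lambda+1}(Q_{U,V}, Q_U Q_V) - I(U;V) - \epsilon \right)},
\end{aligned} \tag{21} \]

where (a) is true for any $\lambda \ge 0$, (b) is Markov’s inequality, 
(c) follows by restricting $\lambda$ to be strictly positive, while 
(d) is from the definition of the Rényi divergence of order 
$\lambda + 1$. We use units of bits for mutual information and Rényi 
divergence to coincide with the base two expression of rate. 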
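Now, substituting $\alpha = \lambda + 1$ into (21) gives 

\[ \mathbb{E}_{\mathbb{B}_n} \mathbb{P}_{Q^n_{V|U}}\Big( \big(\mathbf{U}(w, \mathbb{B}_n), \mathbf{V}\big) \notin \mathcal{A}_\epsilon \,\Big|\, \mathbf{U} = \mathbf{U}(w, \mathbb{B}_n) \Big) \le 2^{-n\beta_{\alpha,\epsilon}}, \tag{22a} \]

where 

\[ \beta_{\alpha,\epsilon} = (\alpha - 1)\Big( I(U;V) + \epsilon - d_\alpha(Q_{U,V}, Q_U Q_V) \Big), \tag{22b} \]

for every $\alpha > 1$ and $\epsilon > 0$, over which we may optimize. 
The optimal choice of $\epsilon$ is apparent when all bounds of the 
proof are considered together (some yet to be derived), but the 
formula may seem arbitrary at the moment. Nevertheless, fix 
$\delta \in \big(0, R - I(U;V)\big)$, as found in the lemma statement, and 
set 

\[ \epsilon_{\alpha,\delta} = \frac{\frac{1}{2}(R - \delta) + (\alpha - 1)\, d_\alpha(Q_{U,V}, Q_U Q_V)}{\frac{1}{2} + (\alpha - 1)} - I(U;V). \tag{23} \]

Substituting into $\beta_{\alpha,\epsilon}$ gives 

\[ \beta_{\alpha,\delta} = \beta_{\alpha,\epsilon_{\alpha,\delta}} = \frac{\alpha - 1}{2\alpha - 1}\Big( R - \delta - d_\alpha(Q_{U,V}, Q_U Q_V) \Big). \tag{24} \]

Observe that $\epsilon_{\alpha,\delta}$ in (23) is nonnegative under the assumption 
that $R - \delta > I(U;V)$, because $\alpha > 1$ and 
$d_\alpha(Q_{U,V}, Q_U Q_V) \ge d_1(Q_{U,V}, Q_U Q_V) = I(U;V)$. 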

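Next, we use the following version of the Chernoff bound 
to bound the probability of (20) not being exponentially small. 

Lemma 4 (Chernoff Bound) Let $\{X_m\}_{m=1}^M$ be a collection 
of i.i.d. random variables with $X_m \in [0, B]$ and $\mathbb{E}X_m \le \mu \ne 0$, 
for all $m \in [1 : M]$. Then for any $c$ with $\frac{c}{\mu} \in [1, 2]$, 

\[ \mathbb{P}\Big( \frac{1}{M} \sum_{m=1}^M X_m \ge c \Big) \le e^{-\frac{M\mu}{3B}\left(\frac{c}{\mu} - 1\right)^2}. \tag{25} \]

The proof is given in Appendix C. 

Using (25) with $M = 2^{nR}$, $\mu = 2^{-n\beta_{\alpha,\delta}}$, $B = 1$, and $\frac{c}{\mu} = 2$ 
assures that $\int dP_{\mathbb{B}_n,2}$ is exponentially small with probability 
doubly-exponentially close to 1. That is, 

\[ \mathbb{P}\Big( \int dP_{\mathbb{B}_n,2} \ge 2 \cdot 2^{-n\beta_{\alpha,\delta}} \Big) \le e^{-\frac{1}{3} 2^{n(R - \beta_{\alpha,\delta})}}. \tag{26} \]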

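Similarly, $\Delta_{\mathbb{B}_n,1}$ is an average of exponentially many i.i.d. 
and uniformly bounded functions, each one determined by one 
sequence in the random codebook: 

\[ \Delta_{\mathbb{B}_n,1}(\mathbf{v}) = \sum_{w \in \mathcal{W}} 2^{-nR}\, \frac{dQ^n_{V|U=\mathbf{U}(w, \mathbb{B}_n)}}{dQ_V^n}(\mathbf{v})\, \mathbb{1}_{\{(\mathbf{U}(w, \mathbb{B}_n), \mathbf{v}) \in \mathcal{A}_\epsilon\}}. \tag{27} \]

For every term in the average, the indicator function bounds 
the value to be between 0 and $2^{n(I(U;V) + \epsilon)}$. The expected 
value of each term with respect to the codebook is bounded 
above by one, which is observed by removing the indicator 
function. Therefore, the Chernoff bound assures that $\Delta_{\mathbb{B}_n,1}$ is 
exponentially close to one for every $\mathbf{v} \in \mathcal{V}^n$. Setting $M = 2^{nR}$, 
$\mu = 1$, $B = 2^{n(I(U;V) + \epsilon_{\alpha,\delta})}$ and $\frac{c}{\mu} = 1 + 2^{-n\beta_{\alpha,\delta}}$ into 
(25) gives 

\[ \mathbb{P}\Big( \Delta_{\mathbb{B}_n,1}(\mathbf{v}) \ge 1 + 2^{-n\beta_{\alpha,\delta}} \Big) \le e^{-\frac{1}{3} 2^{n(R - I(U;V) - \epsilon_{\alpha,\delta} - 2\beta_{\alpha,\delta})}} = e^{-\frac{1}{3} 2^{n\delta}}, \quad \forall \mathbf{v} \in \mathcal{V}^n, \tag{28} \]

which decays doubly-exponentially fast for any $\delta > 0$. 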
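The equality in (28) rests on the identity $I(U;V) + \epsilon_{\alpha,\delta} + 2\beta_{\alpha,\delta} = R - \delta$. A short check, assuming $\epsilon_{\alpha,\delta}$ and $\beta_{\alpha,\delta}$ are as given in (23) and (24), and writing the common denominator as $\alpha - \tfrac{1}{2} = \tfrac{1}{2} + (\alpha - 1)$:

```latex
\begin{align*}
I(U;V) + \epsilon_{\alpha,\delta}
  &= \frac{\tfrac{1}{2}(R-\delta) + (\alpha-1)\, d_\alpha(Q_{U,V}, Q_U Q_V)}{\alpha - \tfrac{1}{2}}, \\
2\beta_{\alpha,\delta}
  &= \frac{(\alpha-1)\bigl(R - \delta - d_\alpha(Q_{U,V}, Q_U Q_V)\bigr)}{\alpha - \tfrac{1}{2}}, \\
I(U;V) + \epsilon_{\alpha,\delta} + 2\beta_{\alpha,\delta}
  &= \frac{\bigl(\tfrac{1}{2} + (\alpha-1)\bigr)(R-\delta)}{\alpha - \tfrac{1}{2}}
   = R - \delta .
\end{align*}
```

Hence $R - I(U;V) - \epsilon_{\alpha,\delta} - 2\beta_{\alpha,\delta} = \delta$, which is the exponent on the right-hand side of (28). The Rényi divergence terms cancel, which is what makes the choice of $\epsilon_{\alpha,\delta}$ in (23) convenient.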

At this point, we specialize to a finite set $\mathcal{V}$. Consequently, 
$\Delta_{\mathbb{B}_n,2}$ is bounded as 

\[ \Delta_{\mathbb{B}_n,2}(\mathbf{v}) \le \Big( \max_{v \in \mathrm{supp}(Q_V)} \frac{1}{Q_V(v)} \Big)^n, \quad \forall \mathbf{v} \in \mathcal{V}^n, \tag{29} \]

with probability 1. Notice that the maximum is only over 
the support of $Q_V$, which makes this bound finite. The 
underlying reason for this restriction is that with probability 
one a conditional distribution is absolutely continuous with 
respect to its associated marginal distribution. 

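Having (26), (28) and (29), we can now bound the probability 
that the RHS of (18) is not exponentially small. Let $\mathcal{S}$ 
be the set of codebooks $\mathcal{B}_n$, such that all of the following are 
true: 

\[ \int dP_{\mathcal{B}_n,2} < 2 \cdot 2^{-n\beta_{\alpha,\delta}}, \tag{30a} \]

\[ \Delta_{\mathcal{B}_n,1}(\mathbf{v}) < 1 + 2^{-n\beta_{\alpha,\delta}}, \quad \forall \mathbf{v} \in \mathcal{V}^n, \tag{30b} \]

\[ \Delta_{\mathcal{B}_n,2}(\mathbf{v}) \le \Big( \max_{v \in \mathrm{supp}(Q_V)} \frac{1}{Q_V(v)} \Big)^n, \quad \forall \mathbf{v} \in \mathcal{V}^n. \tag{30c} \]

First, we use the union bound, while taking advantage of the 
fact that the space $\mathcal{V}^n$ is only exponentially large, to show 
that the probability of a random codebook not being in $\mathcal{S}$ is 
double-exponentially small: 

\[ \begin{aligned}
\mathbb{P}\big( \mathbb{B}_n \notin \mathcal{S} \big)
&\overset{(a)}{\le} \mathbb{P}\Big( \int dP_{\mathbb{B}_n,2} \ge 2 \cdot 2^{-n\beta_{\alpha,\delta}} \Big)
 + \sum_{\mathbf{v} \in \mathcal{V}^n} \mathbb{P}\Big( \Delta_{\mathbb{B}_n,1}(\mathbf{v}) \ge 1 + 2^{-n\beta_{\alpha,\delta}} \Big) \\
&\qquad + \sum_{\mathbf{v} \in \mathcal{V}^n} \mathbb{P}\Big( \Delta_{\mathbb{B}_n,2}(\mathbf{v}) > \Big( \max_{v \in \mathrm{supp}(Q_V)} \frac{1}{Q_V(v)} \Big)^n \Big) \\
&\overset{(b)}{\le} e^{-\frac{1}{3} 2^{n(R - \beta_{\alpha,\delta})}} + |\mathcal{V}|^n e^{-\frac{1}{3} 2^{n\delta}} \\
&\overset{(c)}{\le} \big( 1 + |\mathcal{V}|^n \big) e^{-\frac{1}{3} 2^{n\delta}},
\end{aligned} \tag{31} \]

where (a) is the union bound, (b) uses (26), (28) and (29), 
while (c) follows because $\beta_{\alpha,\delta} \le \frac{1}{2}(R - \delta)$. 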
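The reason the union bound over all of $\mathcal{V}^n$ is affordable is that the doubly-exponential term eventually overwhelms the exponential number of events. A small numerical sketch of the final bound $(1 + |\mathcal{V}|^n) e^{-\frac{1}{3} 2^{n\delta}}$, evaluated in the log domain to avoid underflow, makes this concrete. The function name and the parameter values are illustrative assumptions.

```python
import math

def log_failure_bound(n, v_size, delta):
    """Natural log of (1 + |V|^n) * exp(-(1/3) * 2^(n * delta))."""
    # log(1 + |V|^n) = n * log|V| + log(1 + |V|^(-n)), stable for large n.
    log_union = n * math.log(v_size) + math.log1p(float(v_size) ** (-n))
    return log_union - (2.0 ** (n * delta)) / 3.0

# Hypothetical parameters: binary output alphabet and delta = 0.1.
for n in (10, 50, 100, 200):
    print(n, log_failure_bound(n, v_size=2, delta=0.1))
```

For short blocklengths the bound exceeds one (positive logarithm) and is vacuous. Once $2^{n\delta}/3$ overtakes $n \log |\mathcal{V}|$, the bound collapses rapidly.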


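Next, we claim that for every codebook in $\mathcal{S}$, the RHS of 
(18) is exponentially small. Let $\mathcal{B}_n \in \mathcal{S}$ and consider the 
following. For every $x \in [0, 1]$, $h(x) \le x \log\frac{e}{x}$, using which 
(30a) implies that 

\[ h\Big( \int dP_{\mathcal{B}_n,1} \Big) = h\Big( \int dP_{\mathcal{B}_n,2} \Big) < 2\big( \log e + \beta_{\alpha,\delta} \log 2 \big)\, n 2^{-n\beta_{\alpha,\delta}}. \tag{32} \]

Furthermore, by (30b), we have 

\[ \int dP_{\mathcal{B}_n,1} \log \Delta_{\mathcal{B}_n,1} < \int dP_{\mathcal{B}_n,1} \log\big( 1 + 2^{-n\beta_{\alpha,\delta}} \big) \le \log\big( 1 + 2^{-n\beta_{\alpha,\delta}} \big) \overset{(a)}{\le} 2^{-n\beta_{\alpha,\delta}} \log e, \tag{33} \]

where (a) follows since $\log(1 + x) \le x \log e$, for every $x > 0$. 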
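Finally, using (30c) we obtain 

\[ \int dP_{\mathcal{B}_n,2} \log \Delta_{\mathcal{B}_n,2} \le \int dP_{\mathcal{B}_n,2} \log\Big( \max_{v \in \mathrm{supp}(Q_V)} \frac{1}{Q_V(v)} \Big)^n < 2 n 2^{-n\beta_{\alpha,\delta}} \log\Big( \max_{v \in \mathrm{supp}(Q_V)} \frac{1}{Q_V(v)} \Big). \tag{34} \]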

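Combining (32)-(34) yields 

\[ \begin{aligned}
h\Big( \int dP_{\mathcal{B}_n,1} \Big) &+ \int dP_{\mathcal{B}_n,1} \log \Delta_{\mathcal{B}_n,1} + \int dP_{\mathcal{B}_n,2} \log \Delta_{\mathcal{B}_n,2} \\
&< \Big[ 2\big( \log e + \beta_{\alpha,\delta} \log 2 \big) + \log e + 2 \log\Big( \max_{v \in \mathrm{supp}(Q_V)} \frac{1}{Q_V(v)} \Big) \Big] n 2^{-n\beta_{\alpha,\delta}} \\
&\overset{(a)}{=} c_{\alpha,\delta}\, n 2^{-n\beta_{\alpha,\delta}},
\end{aligned} \tag{35} \]

where (a) comes from setting 

\[ c_{\alpha,\delta} = 3 \log e + 2\beta_{\alpha,\delta} \log 2 + 2 \log\Big( \max_{v \in \mathrm{supp}(Q_V)} \frac{1}{Q_V(v)} \Big). \tag{36} \]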

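This implies that for all $\alpha > 1$ and $\delta \in \big(0, R - I(U;V)\big)$, 

\[ \begin{aligned}
\mathbb{P}\Big( D\big(P_{\mathbf{V}}^{(\mathbb{B}_n)} \big\| Q_V^n\big) \ge c_{\alpha,\delta}\, n 2^{-n\beta_{\alpha,\delta}} \Big)
&\le \mathbb{P}\Big( h\Big( \int dP_{\mathbb{B}_n,1} \Big) + \int dP_{\mathbb{B}_n,1} \log \Delta_{\mathbb{B}_n,1} + \int dP_{\mathbb{B}_n,2} \log \Delta_{\mathbb{B}_n,2} \ge c_{\alpha,\delta}\, n 2^{-n\beta_{\alpha,\delta}} \Big) \\
&\le \mathbb{P}\big( \mathbb{B}_n \notin \mathcal{S} \big) \\
&\overset{(a)}{\le} \big( 1 + |\mathcal{V}|^n \big) e^{-\frac{1}{3} 2^{n\delta}},
\end{aligned} \tag{37} \]

where (a) follows from (31). Denoting $c_\delta = \sup_{\alpha > 1} c_{\alpha,\delta}$, (37) 
further gives 

\[ \mathbb{P}\Big( D\big(P_{\mathbf{V}}^{(\mathbb{B}_n)} \big\| Q_V^n\big) \ge c_\delta\, n 2^{-n\beta_{\alpha,\delta}} \Big) \le \big( 1 + |\mathcal{V}|^n \big) e^{-\frac{1}{3} 2^{n\delta}}. \tag{38} \]

Since (38) is true for all $\alpha > 1$, it must also be true, with 
strict inequality in the LHS, when replacing $\beta_{\alpha,\delta}$ with 

\[ \gamma_\delta = \sup_{\alpha > 1} \beta_{\alpha,\delta} = \sup_{\alpha > 1} \frac{\alpha - 1}{2\alpha - 1}\Big( R - \delta - d_\alpha(Q_{U,V}, Q_U Q_V) \Big), \tag{39} \]

which is the exponential rate of convergence stated in (8a) that 
we derive for the strong soft-covering lemma. This establishes 
the statement from (7) and proves Lemma 1. 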

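Concluding, if $R > I(U;V)$ and for any $\delta \in \big(0, R - I(U;V)\big)$, 
we get exponential convergence of the relative 
entropy at rate $O(2^{-n\gamma_\delta})$ with doubly-exponential certainty. 
Discarding the precise exponents of convergence and coefficients, 
we state that there exist $\gamma_1, \gamma_2 > 0$, such that for $n$ 
large enough 

\[ \mathbb{P}\Big( D\big(P_{\mathbf{V}}^{(\mathbb{B}_n)} \big\| Q_V^n\big) > e^{-n\gamma_1} \Big) \le e^{-e^{n\gamma_2}}. \tag{40} \]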


IV. Wiretap Channel I 

As a rather simple application of the stronger soft-covering 
lemma, we give an alternative derivation of the SS-capacity 
of the WTC I H, HD, im, in. Since the channel to the 
legitimate user is the same in both WTCs I and II, the maximal 
error probability analysis presented here is subsequently used 
to establish reliability for the WTC II. 

Our direct proof relies on classic wiretap codes and SS 
is established using the union bound while invoking the 
stronger soft-covering lemma. In a wiretap code, a subcode 
is associated with each confidential message. To transmit a 
certain message, a codeword from its subcode is selected 
uniformly at random and transmitted over the channel. Letting 
these subcodes be large enough while noting that the number 
of confidential messages only grows exponentially with the 
blocklength, the union bound and the double-exponential de¬ 
cay the lemma provides show the existence of a semantic ally- 
secure sequence of codes. Using these codes, each transmitted 
message induces an output PMF at the eavesdropper that 
appears i.i.d. and does not depend on the message. 

Wyner’s soft-covering lemma, which is now a standard tool 
for achieving strong-secrecy for the WTC I, comes up short 
in providing SS. The classic soft-covering argument says that 
on average over the messages, the output at the eavesdropper 
will look i.i.d., provided that the size of these subcodes is large 
enough. This can be used to claim that the unnormalized mutual 
information between the message and the eavesdropper’s 
output is small, thus ensuring strong-secrecy. However, for 
SS, it must be claimed that the output PMF is close to the i.i.d. 
distribution for all messages, and there are exponentially many 
messages. Here is where the stronger soft-covering lemma is 
advantageous. 

Fig. 2. The classic wiretap channel, referred to as the WTC I. 


A. Problem Definition 

The DM-WTC I is illustrated in Fig. 2. The sender chooses a 
message $m$ from the set $[1 : 2^{nR}]$ and maps it into a sequence 
$\mathbf{x} \in \mathcal{X}^n$ (the mapping may be random). The sequence $\mathbf{x}$ is 
transmitted over the DM-WTC I with transition probability 
$Q_{Y,Z|X}$. The output sequences $\mathbf{y} \in \mathcal{Y}^n$ and $\mathbf{z} \in \mathcal{Z}^n$ are 
observed by the receiver and the eavesdropper, respectively. 
Based on $\mathbf{y}$, the receiver produces an estimate $\hat{m}$ of $m$. The 
eavesdropper tries to glean whatever it can about the message 
from $\mathbf{z}$. 

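Definition 1 (Code Description) An $(n, R)$ WTC I code $\mathcal{C}_n$ 
has: 

1) A message set $\mathcal{M} = [1 : 2^{nR}]$. 

2) A stochastic encoder $f : \mathcal{M} \to \mathcal{P}(\mathcal{X}^n)$. 

3) A decoding function $\phi : \mathcal{Y}^n \to \hat{\mathcal{M}}$, where $\hat{\mathcal{M}} = \mathcal{M} \cup \{e\}$ 
and $e \notin \mathcal{M}$. 

For any message distribution $P_M \in \mathcal{P}(\mathcal{M})$, the joint PMF 
over $\mathcal{M} \times \mathcal{X}^n \times \mathcal{Y}^n \times \mathcal{Z}^n \times \hat{\mathcal{M}}$ induced by $P_M$ and an $(n, R)$ 
code $\mathcal{C}_n$ is: 

\[ P^{(\mathcal{C}_n)}(m, \mathbf{x}, \mathbf{y}, \mathbf{z}, \hat{m}) = P_M(m)\, f(\mathbf{x}|m)\, Q^n_{Y,Z|X}(\mathbf{y}, \mathbf{z}|\mathbf{x})\, \mathbb{1}_{\{\hat{m} = \phi(\mathbf{y})\}}. \tag{41} \]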

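Definition 2 (Maximal Error Probability) The maximal error 
probability of an $(n, R)$ WTC I code $\mathcal{C}_n$ is 

\[ e^\star(\mathcal{C}_n) = \max_{m \in \mathcal{M}} e_m(\mathcal{C}_n), \tag{42a} \]

where 

\[ e_m(\mathcal{C}_n) = \sum_{\mathbf{x} \in \mathcal{X}^n} f(\mathbf{x}|m) \sum_{\substack{\mathbf{y} \in \mathcal{Y}^n : \\ \phi(\mathbf{y}) \ne m}} Q^n_{Y|X}(\mathbf{y}|\mathbf{x}). \tag{42b} \]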
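For a toy code, the per-message error probabilities in (42b) and their maximum in (42a) can be computed by brute-force enumeration of the output sequences. The sketch below is only an illustration: the data layout (a dictionary of codeword distributions for the stochastic encoder, a nested dictionary for the single-letter main channel) and the length-3 repetition example are assumptions introduced here, not constructions from the paper.

```python
import itertools

def error_probabilities(encoder, decoder, channel, out_alphabet, n):
    """Per-message error probabilities, following the double sum in (42b).

    encoder[m] is a dict {codeword tuple: probability} (stochastic encoder),
    decoder maps an output tuple to a message estimate,
    channel[x][y] is the single-letter main-channel probability of y given x.
    """
    errors = {}
    for m, codewords in encoder.items():
        e_m = 0.0
        for x, fx in codewords.items():
            for y in itertools.product(out_alphabet, repeat=n):
                if decoder(y) != m:
                    p = 1.0
                    for xi, yi in zip(x, y):
                        p *= channel[xi][yi]  # memoryless channel
                    e_m += fx * p
        errors[m] = e_m
    return errors

def majority(y):
    """Majority-vote decoder for the binary repetition code."""
    return int(sum(y) >= 2)

# Hypothetical toy example: length-3 repetition code over a BSC(0.1).
bsc = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}
encoder = {0: {(0, 0, 0): 1.0}, 1: {(1, 1, 1): 1.0}}
e = error_probabilities(encoder, majority, bsc, (0, 1), 3)
print(max(e.values()))  # maximal error probability over the two messages
```

In this example both messages err exactly when at least two of the three symbols are flipped, so the maximal and average error probabilities coincide. In general the maximum can be much larger than the average, which is why expurgation is needed.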

Definition 3 (SS Metric) The SS metric associated to an 
$(n, R)$ WTC I code $\mathcal{C}_n$ is^3 

\[ \mathrm{Sem}(\mathcal{C}_n) = \max_{P_M \in \mathcal{P}(\mathcal{M})} I_{\mathcal{C}_n}(M; \mathbf{Z}), \tag{43} \]

where $I_{\mathcal{C}_n}$ denotes a mutual information term that is calculated 
with respect to the PMF induced by $\mathcal{C}_n$ from (41). 

Definition 4 (Semantically-Secure Codes) A sequence of 
$(n, R)$ WTC I codes $\{\mathcal{C}_n\}_{n \in \mathbb{N}}$ is semantically-secure if there 
is a constant $\gamma > 0$ and an $n_0 \in \mathbb{N}$, such that for every 
$n > n_0$, $\mathrm{Sem}(\mathcal{C}_n) \le e^{-n\gamma}$. 

^3 $\mathrm{Sem}(\mathcal{C}_n)$ is actually the mutual-information-security (MIS) metric, which 
is equivalent to SS by [1]. We use the representation in (43) rather than 
the formal definition of SS (see, e.g., [2, Equation (4)]) out of analytical 
convenience. 

Remark 3 SS requires that a single sequence of codes works 
well for all message PMFs. Accordingly, the mutual information 
term in (43) is maximized over $P_M$ when the code $\mathcal{C}_n$ is 
known. In other words, although not stated explicitly, $P_M$ is 
a function of $\mathcal{C}_n$. 

Remark 4 By Definition 4, for a sequence of WTC I codes to 
be semantically-secure, the SS metric from (43) must vanish 
exponentially fast. This is a standard requirement in the 
cryptography community, commonly referred to as strong-SS 
(see, e.g., E\ Section 3.2]). The coding scheme given in the 
direct proof of Theorem 1 achieves this exponential decay of 
the SS-metric (see Section IV-C). An exponential decay of the 
strong-secrecy metric was previously observed in 4271? . 42(SI? . 

(im?. 

Definition 5 (SS-Achievability) A rate $R \in \mathbb{R}_+$ is SS-achievable 
if there is a sequence of $(n, R)$ WTC I semantically-secure 
codes with $e^\star(\mathcal{C}_n) \to 0$ as $n \to \infty$. 

Definition 6 (SS-Capacity) The SS-capacity of the WTC I, 
$C_{\mathrm{Sem}}$, is the supremum of the set of SS-achievable rates. 


B. Results 

As stated in the following theorem, the SS-capacity of 
the WTC I under a maximal error probability constraint is 
the same as its weak-secrecy-capacity under an average error 
probability constraint. 


Theorem 1 (WTC I SS-Capacity) The SS-capacity of the 
WTC I is 

\[ C_{\mathrm{Sem}} = \max_{\substack{Q_{U,X} : \\ U - X - (Y,Z)}} \Big[ I(U;Y) - I(U;Z) \Big], \tag{44} \]

and one may restrict the cardinality of $\mathcal{U}$ to $|\mathcal{U}| \le |\mathcal{X}|$. 


The proof of Theorem 1 is given in Section IV-C1. Our
achievability proof relies on the stronger soft-covering lemma
to establish the existence of a sequence of semantically-secure
codes with a vanishing average probability of error.
The expurgation technique [22, Theorem 7.7.1] is then used
to upgrade the codes to have a vanishing maximal error
probability.


Remark 5 The cardinality bound in Theorem 1 was established
in [35, Theorem 22.1].


Remark 6 The direct part of Theorem 1 can also be derived
without using the stronger soft-covering lemma. Instead, one
may invoke the codebook expurgation technique twice. By
removing a certain portion of the messages, any sequence
of codes that ensures strong-secrecy and a vanishing average
error probability can be upgraded to provide SS and reliability
with respect to the maximal error probability with negligible
rate-loss. In the original codes, the fraction of messages
that induce an error probability greater than three times the
average is, by Markov's inequality, less than $1/3$. Similarly,
the fraction of messages with secrecy distance greater than
three times the average is less than $1/3$. Therefore, the
fraction of offending messages is less than $2/3$. By removing
them one obtains a new sequence of codes that is
semantically-secure and has a vanishing maximal error
probability. Finally, the rate of the $n$-th code in the new
sequence is at least $R - \frac{1}{n}\log 3$ (here $R$ stands for the
rate of the original codes), and the loss is negligible for large $n$.
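
The double expurgation in Remark 6 can be illustrated with a small numerical sketch. The per-message error probabilities and secrecy distances below are hypothetical toy values (not taken from the paper), and the helper name `expurgate` is ours; the sketch only shows that discarding every message exceeding three times either average keeps more than a third of the messages.

```python
def expurgate(error_probs, secrecy_dists, factor=3.0):
    """Keep the messages whose error probability and secrecy distance are
    both at most `factor` times the corresponding average over messages."""
    num = len(error_probs)
    avg_err = sum(error_probs) / num
    avg_sec = sum(secrecy_dists) / num
    return [m for m in range(num)
            if error_probs[m] <= factor * avg_err
            and secrecy_dists[m] <= factor * avg_sec]

# Hypothetical toy code with 12 messages: two are unreliable, one leaks.
errors = [0.01] * 10 + [0.50, 0.40]
leaks = [0.002] * 9 + [0.300] + [0.002] * 2

kept = expurgate(errors, leaks)
kept_fraction = len(kept) / len(errors)
```

By Markov's inequality each of the two thresholds removes fewer than a third of the messages, so `kept_fraction` always exceeds 1/3, which corresponds to a rate loss of at most $\frac{1}{n}\log 3$.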


Remark 7 The expurgation method is insufficient for establishing
SS for the WTC II because the messages that need to
be removed might differ from one choice of the eavesdropper’s 
observations to the next. It also does not work in other 
settings such as the multiple access WTC, where expurgation 
is problematic in general. On the other hand, even for that 
setting, an achievability proof that relies on the stronger 
soft-covering lemma goes through by similar steps to those 
presented below. Thus, strong-secrecy can be upgraded to SS 
even in situations where vanishing average error probability 
cannot be upgraded to vanishing maximum error probability 
(via expurgation). 


C. Proofs 

1) Theorem 1: For the converse, let $\{\mathcal{C}_n\}_{n\in\mathbb{N}}$ be a sequence
of $(n,R)$ semantically-secure WTC I codes with $e^*(\mathcal{C}_n) \to 0$.
Since both $e^*(\mathcal{C}_n) \to 0$ and $\mathrm{Sem}(\mathcal{C}_n) \to 0$ hold for any
message distribution $P_M \in \mathcal{P}(\mathcal{M})$, in particular, they hold
for a uniform $P_M$. The converse thus follows since $C_{\mathrm{Sem}}$ in
(44) coincides with the secrecy-capacity of the WTC I under
a vanishing average error probability criterion and the
weak-secrecy constraint.

For the direct part, we first establish the achievability of (44)
when $U = X$. Then, a standard channel prefixing argument
extends the proof to any $U$ with $U - X - Y$.

Fix $\epsilon > 0$, a PMF $Q_X \in \mathcal{P}(\mathcal{X})$, and let $M$ and $W$
be independent random variables uniformly distributed over
$\mathcal{M}$ and $\mathcal{W} = [1 : 2^{n\tilde{R}}]$, respectively. $M$ represents the
choice of the message, while $W$ stands for the stochastic
part of the encoder. Thus, we start by imposing a uniform
distribution over the set of messages and use this to show the
existence of a semantically-secure sequence of $(n,R)$ codes
with a vanishing average error probability. Afterwards, the
uniform message distribution assumption is dropped using
the expurgation technique [22, Theorem 7.7.1], which allows
upgrading reliability to achieve a vanishing maximal error
probability, while preserving SS.

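Codebook $\mathcal{B}_n$: Let $\mathbb{B}_n$ be a random codebook
given by a collection of i.i.d. random vectors
$\{\mathbf{X}(m,w)\}_{(m,w)\in\mathcal{M}\times\mathcal{W}}$, each distributed
according to $Q_X^n$. A realization of $\mathbb{B}_n$ is denoted by
$\mathcal{B}_n = \{\mathbf{x}(m,w,\mathcal{B}_n)\}_{(m,w)\in\mathcal{M}\times\mathcal{W}}$, with respect to which a
classic wiretap code is constructed.

Encoder $f_n$: To send $m \in \mathcal{M}$ the encoder randomly and
uniformly chooses $W = w$ from $\mathcal{W}$ and transmits $\mathbf{x}(m,w,\mathcal{B}_n)$
over the WTC I.

Decoder $\phi_n$: Upon observing $\mathbf{y} \in \mathcal{Y}^n$, the decoder searches
for a unique pair $(\hat{m},\hat{w}) \in \mathcal{M} \times \mathcal{W}$ such that

$$\big(\mathbf{x}(\hat{m},\hat{w},\mathcal{B}_n),\mathbf{y}\big) \in \mathcal{T}_\epsilon^n(Q_{X,Y}). \tag{45}$$

If such a unique pair is found, then set $\phi_n(\mathbf{y}) = \hat{m}$; otherwise,
$\phi_n(\mathbf{y}) = e$.

The triple $(\mathcal{M}, f_n, \phi_n)$ defined with respect to the codebook
$\mathcal{B}_n$ constitutes an $(n,R)$ WTC I code $\mathcal{C}_n$. When a random
codebook $\mathbb{B}_n$ is used, we denote the corresponding random
code by $\mathbb{C}_n$.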

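Average Error Probability Analysis: By standard joint
typicality arguments we show that the average error probability,
when expected over the ensemble of codebooks, is arbitrarily
small. For every fixed codebook $\mathcal{B}_n$ and $(\hat{m},\hat{w}) \in \mathcal{M} \times \mathcal{W}$,
define the event

$$\mathcal{E}(\hat{m},\hat{w},\mathcal{B}_n) = \Big\{ \big(\mathbf{x}(\hat{m},\hat{w},\mathcal{B}_n),\mathbf{Y}\big) \in \mathcal{T}_\epsilon^n(Q_{X,Y}) \Big\}, \tag{46}$$

where $\mathbf{Y} \sim Q^n_{Y|X=\mathbf{x}(m,w,\mathcal{B}_n)}$ is the random sequence observed
at the receiver when the transmitter sends $(m,w)$. We
have

$$\begin{aligned}
\mathbb{E}_{\mathbb{C}_n} \frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} e_m(\mathbb{C}_n)
&= \mathbb{E}_{\mathbb{C}_n} \mathbb{P}_{\mathbb{C}_n}\big(\hat{M} \neq M\big) \\
&\le \mathbb{E}_{\mathbb{C}_n} \mathbb{P}_{\mathbb{C}_n}\big((\hat{M},\hat{W}) \neq (M,W)\big) \\
&\stackrel{(a)}{=} \mathbb{E}_{\mathbb{C}_n} \mathbb{P}_{\mathbb{C}_n}\big((\hat{M},\hat{W}) \neq (1,1) \,\big|\, M=1, W=1\big) \\
&\stackrel{(b)}{=} \mathbb{E}_{\mathbb{B}_n} \mathbb{P}\Big( \mathcal{E}(1,1,\mathbb{B}_n)^c \cup \bigcup_{(\hat{m},\hat{w}) \neq (1,1)} \mathcal{E}(\hat{m},\hat{w},\mathbb{B}_n) \Big) \\
&\stackrel{(c)}{\le} \underbrace{\mathbb{P}_{Q^n_{X,Y}}\big((\mathbf{X},\mathbf{Y}) \notin \mathcal{T}_\epsilon^n(Q_{X,Y})\big)}_{P_1}
 + \underbrace{\sum_{(\hat{m},\hat{w}) \neq (1,1)} \mathbb{P}_{Q^n_X \times Q^n_Y}\big((\mathbf{X},\mathbf{Y}) \in \mathcal{T}_\epsilon^n(Q_{X,Y})\big)}_{P_2},
\end{aligned} \tag{47}$$

where (a) uses the symmetry of the codebook construction
with respect to $m$ and $w$, (b) follows by the decoding rule,
while (c) takes the expectation over the ensemble of codebooks
and uses the union bound.

By the law of large numbers $P_1 \to 0$ as $n \to \infty$, while
$P_2 \to 0$ as $n$ grows provided that

$$R + \tilde{R} < I(X;Y). \tag{48}$$

Thus, we have

$$\mathbb{E}_{\mathbb{C}_n} \frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} e_m(\mathbb{C}_n) \xrightarrow[n\to\infty]{} 0. \tag{49}$$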

Security Analysis: For any fixed $\mathcal{B}_n$ (which, in turn, fixes
$\mathcal{C}_n$), we denote by $P^{(\mathcal{C}_n)}_{M,\mathbf{Z}}$ the joint distribution of $M$ and $\mathbf{Z}$
induced by the code $\mathcal{C}_n$ (see (41)). For any $\mathcal{C}_n$ we first have

^All subsequent mutual information terms in the proof are calculated with
respect to $Q_{U,X}Q_{Y,Z|X}$ or its marginals.


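$$\begin{aligned}
\max_{P_M \in \mathcal{P}(\mathcal{M})} I_{\mathcal{C}_n}(M;\mathbf{Z})
&\stackrel{(a)}{=} \max_{P_M \in \mathcal{P}(\mathcal{M})} D\Big( P^{(\mathcal{C}_n)}_{\mathbf{Z}|M} \Big\| P^{(\mathcal{C}_n)}_{\mathbf{Z}} \Big| P_M \Big) \\
&\stackrel{(b)}{\le} \max_{P_M \in \mathcal{P}(\mathcal{M})} D\Big( P^{(\mathcal{C}_n)}_{\mathbf{Z}|M} \Big\| Q_Z^n \Big| P_M \Big) \\
&= \max_{P_M \in \mathcal{P}(\mathcal{M})} \sum_{m\in\mathcal{M}} P(m)\, D\Big( P^{(\mathcal{C}_n)}_{\mathbf{Z}|M=m} \Big\| Q_Z^n \Big) \\
&\le \max_{P_M \in \mathcal{P}(\mathcal{M})} \sum_{m\in\mathcal{M}} P(m) \max_{\tilde{m}\in\mathcal{M}} D\Big( P^{(\mathcal{C}_n)}_{\mathbf{Z}|M=\tilde{m}} \Big\| Q_Z^n \Big) \\
&= \max_{m\in\mathcal{M}} D\Big( P^{(\mathcal{C}_n)}_{\mathbf{Z}|M=m} \Big\| Q_Z^n \Big),
\end{aligned} \tag{50}$$

where (a) uses the relative entropy chain rule, while (b) is because
for any $P_M \in \mathcal{P}(\mathcal{M})$, we have

$$\begin{aligned}
D\Big( P^{(\mathcal{C}_n)}_{\mathbf{Z}|M} \Big\| P^{(\mathcal{C}_n)}_{\mathbf{Z}} \Big| P_M \Big)
&= \sum_{m\in\mathcal{M}} P(m) \sum_{\mathbf{z}} P^{(\mathcal{C}_n)}(\mathbf{z}|m) \log \frac{P^{(\mathcal{C}_n)}(\mathbf{z}|m)}{P^{(\mathcal{C}_n)}(\mathbf{z})} \\
&= D\Big( P^{(\mathcal{C}_n)}_{\mathbf{Z}|M} \Big\| Q_Z^n \Big| P_M \Big) - D\Big( P^{(\mathcal{C}_n)}_{\mathbf{Z}} \Big\| Q_Z^n \Big) \\
&\le D\Big( P^{(\mathcal{C}_n)}_{\mathbf{Z}|M} \Big\| Q_Z^n \Big| P_M \Big).
\end{aligned} \tag{51}$$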
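
The decomposition behind (51), namely that the mutual information equals the conditional relative entropy to any reference PMF minus the relative entropy of the induced output PMF to that reference, can be checked numerically. The following is a single-letter toy sketch; the PMFs `P_M`, `P_Z_given_M` and the reference `Q_Z` are arbitrary illustrative values, not quantities from the paper.

```python
from math import log

def kl(p, q):
    """Relative entropy D(p||q) in nats for PMFs given as lists."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Arbitrary toy distributions (assumptions for illustration only).
P_M = [0.2, 0.5, 0.3]
P_Z_given_M = [[0.7, 0.2, 0.1],
               [0.1, 0.6, 0.3],
               [0.3, 0.3, 0.4]]
Q_Z = [0.25, 0.35, 0.40]

# Output PMF induced by P_M and the conditional PMF.
P_Z = [sum(P_M[m] * P_Z_given_M[m][z] for m in range(3)) for z in range(3)]

mutual_info = sum(P_M[m] * kl(P_Z_given_M[m], P_Z) for m in range(3))
cond_div = sum(P_M[m] * kl(P_Z_given_M[m], Q_Z) for m in range(3))
output_div = kl(P_Z, Q_Z)
max_div = max(kl(row, Q_Z) for row in P_Z_given_M)
```

Here `mutual_info` equals `cond_div - output_div`, so it is upper bounded by `cond_div` and, in turn, by `max_div`, mirroring the chain of inequalities in (50).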


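Now, let $\gamma$ be an arbitrary positive real number to be
determined later and consider the following probability.

$$\begin{aligned}
\mathbb{P}\Big( \big\{ \mathrm{Sem}(\mathbb{C}_n) \le e^{-n\gamma} \big\}^c \Big)
&\stackrel{(a)}{\le} \mathbb{P}\Big( \Big\{ \forall m \in \mathcal{M},\ D\Big( P^{(\mathbb{C}_n)}_{\mathbf{Z}|M=m} \Big\| Q_Z^n \Big) \le e^{-n\gamma} \Big\}^c \Big) \\
&= \mathbb{P}\Big( \exists m \in \mathcal{M},\ D\Big( P^{(\mathbb{C}_n)}_{\mathbf{Z}|M=m} \Big\| Q_Z^n \Big) > e^{-n\gamma} \Big) \\
&= \mathbb{P}\Big( \bigcup_{m\in\mathcal{M}} \Big\{ D\Big( P^{(\mathbb{C}_n)}_{\mathbf{Z}|M=m} \Big\| Q_Z^n \Big) > e^{-n\gamma} \Big\} \Big) \\
&\le \sum_{m\in\mathcal{M}} \mathbb{P}\Big( D\Big( P^{(\mathbb{C}_n)}_{\mathbf{Z}|M=m} \Big\| Q_Z^n \Big) > e^{-n\gamma} \Big),
\end{aligned} \tag{52}$$

where (a) follows from (50), and the final inequality is the union bound.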

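By the stronger soft-covering lemma, if

$$\tilde{R} > I(X;Z), \tag{53}$$

then there are $\gamma_1, \gamma_2 > 0$ such that

$$\mathbb{P}\Big( D\Big( P^{(\mathbb{C}_n)}_{\mathbf{Z}|M=m} \Big\| Q_Z^n \Big) > e^{-n\gamma_1} \Big) \le e^{-e^{n\gamma_2}} \tag{54}$$

for sufficiently large $n$. Inserting (54) into (52) while setting
$\gamma = \gamma_1$, we have

$$\mathbb{P}\Big( \big\{ \mathrm{Sem}(\mathbb{C}_n) \le e^{-n\gamma_1} \big\}^c \Big) \le \sum_{m\in\mathcal{M}} e^{-e^{n\gamma_2}} = 2^{nR} e^{-e^{n\gamma_2}} \triangleq \nu_n \xrightarrow[n\to\infty]{} 0, \tag{55}$$

and therefore,

$$\mathbb{P}\Big( \mathrm{Sem}(\mathbb{C}_n) \le e^{-n\gamma_1} \Big) \ge 1 - \nu_n \xrightarrow[n\to\infty]{} 1. \tag{56}$$

Inequality (56) implies that if $\tilde{R}$ satisfies (53), the probability
that a randomly generated sequence of codes meets the SS
criterion for large $n$ is arbitrarily close to 1. In fact, because
the bound in (55) decays so rapidly, the Borel-Cantelli lemma implies
that almost every sequence of realizations of $\{\mathbb{C}_n\}_{n\in\mathbb{N}}$ is
semantically-secure.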

SS-Achievability: To establish the existence of a sequence
of $(n,R)$ reliable and semantically-secure codes $\{\mathcal{C}_n\}_{n\in\mathbb{N}}$,
we reproduce the Selection Lemma ll^ Lemma 2.2]. 


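Lemma 5 (Selection Lemma) Let $\{A_n\}_{n\in\mathbb{N}}$ be a sequence
of random variables, where $A_n$ takes values in $\mathcal{A}_n$. Let
$\{f_n^{(1)}, f_n^{(2)}, \ldots, f_n^{(I)}\}_{n\in\mathbb{N}}$ be a collection of $I < \infty$ sequences
of bounded functions $f_n^{(i)} : \mathcal{A}_n \to \mathbb{R}_+$, $i \in [1 : I]$. If

$$\mathbb{E} f_n^{(i)}(A_n) \xrightarrow[n\to\infty]{} 0, \quad \forall i \in [1 : I], \tag{57a}$$

then there exists a sequence $\{a_n\}_{n\in\mathbb{N}}$, where $a_n \in \mathcal{A}_n$ for
every $n \in \mathbb{N}$, such that

$$f_n^{(i)}(a_n) \xrightarrow[n\to\infty]{} 0, \quad \forall i \in [1 : I]. \tag{57b}$$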
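
For a fixed $n$, the mechanism behind the Selection Lemma is Markov's inequality combined with the union bound: with $I$ nonnegative functions, some realization keeps every function within a factor $I + 1$ of its expectation. The sketch below illustrates this on a finite toy ensemble; the candidate set, the probabilities and the two functions are hypothetical stand-ins (for instance, for the average error probability and a secrecy indicator of a codebook).

```python
def select(candidates, probs, functions):
    """Return a realization a (with positive probability) such that
    f(a) <= (I + 1) * E[f(A)] for each of the I nonnegative functions."""
    num_f = len(functions)
    means = [sum(p * f(a) for a, p in zip(candidates, probs))
             for f in functions]
    for a, p in zip(candidates, probs):
        if p > 0 and all(f(a) <= (num_f + 1) * mean
                         for f, mean in zip(functions, means)):
            return a, means
    # Markov + union bound: the bad set has probability at most I/(I+1) < 1.
    raise AssertionError("no realization found; inputs must be nonnegative")

# Hypothetical toy ensemble of six realizations, chosen uniformly.
candidates = list(range(6))
probs = [1 / 6] * 6
f1_values = [1.00, 0.05, 0.02, 0.80, 0.03, 0.01]
f2_values = [0.01, 0.70, 0.02, 0.90, 0.60, 0.03]
functions = [lambda a: f1_values[a], lambda a: f2_values[a]]

chosen, means = select(candidates, probs, functions)
```

Applying this for every $n$ with expectations that vanish as $n$ grows yields a deterministic sequence along which all $I$ functions vanish, which is the statement of the lemma.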


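For completeness, the proof of Lemma 5 is given in
Appendix D. Applying Lemma 5 to the random variables
$\{\mathbb{C}_n\}_{n\in\mathbb{N}}$ and the functions $\frac{1}{|\mathcal{M}|}\sum_{m\in\mathcal{M}} e_m(\mathbb{C}_n)$ and
$\mathbb{1}_{\{\mathrm{Sem}(\mathbb{C}_n) > e^{-n\gamma_1}\}}$, while using (49) and (55), we have that
there is a sequence of $(n,R)$ WTC I codes $\{\mathcal{C}_n\}_{n\in\mathbb{N}}$ for
which

$$\frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} e_m(\mathcal{C}_n) \xrightarrow[n\to\infty]{} 0, \tag{58a}$$

$$\mathbb{1}_{\{\mathrm{Sem}(\mathcal{C}_n) > e^{-n\gamma_1}\}} \xrightarrow[n\to\infty]{} 0. \tag{58b}$$

Since the indicator function in (58b) takes only the values 0
and 1, to satisfy the convergence there must exist an $n_0 \in \mathbb{N}$,
such that

$$\mathbb{1}_{\{\mathrm{Sem}(\mathcal{C}_n) > e^{-n\gamma_1}\}} = 0, \quad \forall n > n_0, \tag{59}$$

and therefore,

$$\mathrm{Sem}(\mathcal{C}_n) \le e^{-n\gamma_1}, \quad \forall n > n_0. \tag{60}$$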

The final step is to amend $\{\mathcal{C}_n\}_{n\in\mathbb{N}}$ to be reliable with
respect to the maximal error probability (as defined in (42a)).
This is done using the expurgation technique (see, e.g., [22,
Theorem 7.7.1]). Namely, we discard the worst half of the
codewords in each codebook $\mathcal{B}_n$. Denoting the amended
sequence of codebooks by $\{\mathcal{B}_n^*\}_{n\in\mathbb{N}}$ and their corresponding
sequence of codes by $\{\mathcal{C}_n^*\}_{n\in\mathbb{N}}$, we have

$$e^*(\mathcal{C}_n^*) \xrightarrow[n\to\infty]{} 0. \tag{61}$$

Note that in each $\mathcal{C}_n^*$ there are $2^{nR-1}$ codewords, i.e., throwing
out half the codewords has changed the rate from $R$ to $R - \frac{1}{n}$,
which is negligible for large $n$. Further note that because
$\{\mathcal{C}_n\}_{n\in\mathbb{N}}$ is semantically-secure, so is $\{\mathcal{C}_n^*\}_{n\in\mathbb{N}}$. Combining
(48) with (53), we have that every

$$0 < R < \max_{Q_X} \big[ I(X;Y) - I(X;Z) \big] \tag{62}$$

is SS-achievable.

To establish the achievability of $C_{\mathrm{Sem}}$ from (44), we prefix
a DM-channel (DMC) $Q_{X|U}$ to the original WTC I $Q_{Y,Z|X}$
to obtain a new channel $Q_{Y,Z|U}$, where

$$Q_{Y,Z|U}(y,z|u) = \sum_{x\in\mathcal{X}} Q_{X|U}(x|u)\, Q_{Y,Z|X}(y,z|x). \tag{63}$$

Using a similar analysis as above with respect to $Q_{Y,Z|U}$, any
$R \in \mathbb{R}_+$ satisfying

$$R < \max_{Q_{U,X}:\ U - X - (Y,Z)} \big[ I(U;Y) - I(U;Z) \big] \tag{64}$$

is achievable.
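
The prefixing step in (63) is a composition of two stochastic matrices. As a minimal sketch, the function below computes the prefixed channel for finite alphabets; the function name `prefix_channel` and the toy transition matrices are our own illustrative choices, with each output pair $(y,z)$ flattened into a single column index.

```python
def prefix_channel(q_x_given_u, q_yz_given_x):
    """Compute Q_{Y,Z|U}(y,z|u) = sum_x Q_{X|U}(x|u) Q_{Y,Z|X}(y,z|x).

    Rows index the conditioning symbol; columns of q_yz_given_x index
    the flattened output pairs (y, z)."""
    num_outputs = len(q_yz_given_x[0])
    return [[sum(row_u[x] * q_yz_given_x[x][o] for x in range(len(row_u)))
             for o in range(num_outputs)]
            for row_u in q_x_given_u]

# Toy example (assumed values): binary U and X, four output pairs (y, z).
q_x_given_u = [[0.9, 0.1],
               [0.2, 0.8]]
q_yz_given_x = [[0.72, 0.18, 0.08, 0.02],
                [0.03, 0.07, 0.27, 0.63]]

q_yz_given_u = prefix_channel(q_x_given_u, q_yz_given_x)
```

Each row of `q_yz_given_u` is again a PMF, so the prefixed channel is a valid WTC I to which the preceding analysis applies with $U$ in the role of the input.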


V. Wiretap Channel II 

The WTC II scenario considers communication between 
two legitimate parties in the presence of an eavesdropper that 
can choose to observe any subset of the transmitted sequence,
while being limited in quantity. The challenge in this setting is 
that the eavesdropper knows the codebook when it selects the 
subset to observe. Therefore, secrecy will only be achieved 
if it is achieved uniformly for all selections of packets, of 
which there are exponentially many possibilities. Furthermore, 
SS being our goal, secrecy must be ensured for each one of 
the exponentially many confidential messages. Nonetheless, 
as the combined number of subsets and messages grows only 
exponentially with the blocklength, using the stronger soft-
covering lemma we show that rates all the way up to the weak-
secrecy-capacity of the DM erasure WTC I are achievable even 
in this more stringent setting. Then, we establish the capacity 
of this WTC I as an upper bound on the considered WTC II, 
thus characterizing its SS-capacity. 


A. Problem Definition 

The WTC II is illustrated in Fig. 3. The sender chooses a
message $m$ from the set $[1 : 2^{nR}]$ and maps it into a sequence
$\mathbf{x} \in \mathcal{X}^n$ (the mapping may be random). The sequence $\mathbf{x}$ is
transmitted over a point-to-point DMC with transition probability
$Q_{Y|X}$. Based on the received channel output sequence
$\mathbf{y} \in \mathcal{Y}^n$, the receiver produces an estimate $\hat{m}$ of $m$. The
eavesdropper noiselessly observes a subset of its choice of
the $n$ transmitted symbols. Namely, the eavesdropper chooses
$\mathcal{S} \subseteq [1 : n]$, $|\mathcal{S}| = \mu \le n$, and observes $\mathbf{z} \in (\mathcal{X} \cup \{?\})^n$,
where

$$z_i = \begin{cases} x_i, & i \in \mathcal{S} \\ ?, & i \notin \mathcal{S}. \end{cases} \tag{65}$$



Fig. 3. The type II wiretap channel. 


Based on z, the eavesdropper tries to learn as much as possible 
about the message. 

With some abuse of notation (reusing notations from Section
IV-A), we introduce the following definitions. An $(n,R)$ WTC
II code $\mathcal{C}_n$ and the corresponding maximal error probability
$e^*(\mathcal{C}_n)$ are defined similarly to Definitions 1 and 2, respectively.


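Definition 7 (SS Metric) The SS metric with respect to an
$(n,R)$ WTC II code $\mathcal{C}_n$ is

$$\mathrm{Sem}_\mu(\mathcal{C}_n) = \max_{\substack{P_M \in \mathcal{P}(\mathcal{M}), \\ \mathcal{S} \subseteq [1:n]:\ |\mathcal{S}| = \mu}} I_{\mathcal{C}_n}(M;\mathbf{Z}), \tag{66}$$

where $I_{\mathcal{C}_n}$ denotes that the mutual information term is calculated
with respect to the joint PMF of $M$ and $\mathbf{Z}$ induced by the code
$\mathcal{C}_n$ and the observed subset $\mathcal{S}$.
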
Remark 8 As explained in Remark 3, the code $\mathcal{C}_n$ is known
when the mutual information term in (66) is maximized. Thus,
the observed subset $\mathcal{S} \subseteq [1 : n]$ and the message PMF $P_M$ are
both functions of $\mathcal{C}_n$. Although, for the sake of simplicity, this
dependence is omitted from our notations, the reader should
keep in mind that a single codebook is required to work well
for all choices of subsets and message PMFs.


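Definition 8 (Semantically-Secure Codes) Let $\alpha \in [0,1]$
and $\mu = \lfloor \alpha n \rfloor$. A sequence of $(n,R)$ WTC II codes $\{\mathcal{C}_n\}_{n\in\mathbb{N}}$
is $\alpha$-semantically-secure if there is a constant $\gamma > 0$ and an
$n_0 \in \mathbb{N}$, such that for every $n > n_0$, $\mathrm{Sem}_\mu(\mathcal{C}_n) \le e^{-n\gamma}$.

Definition 9 (SS-Achievability) Let $\alpha \in [0,1]$ and $\mu = \lfloor \alpha n \rfloor$.
A rate $R \in \mathbb{R}_+$ is $\alpha$-SS-achievable if there is a sequence
of $(n,R)$ $\alpha$-semantically-secure WTC II codes $\{\mathcal{C}_n\}_{n\in\mathbb{N}}$ with
$e^*(\mathcal{C}_n) \to 0$ as $n \to \infty$.

Definition 10 (SS-Capacity) For any $\alpha \in [0,1]$, the $\alpha$-SS-capacity
of the WTC II, $C_{\mathrm{Sem}}(\alpha)$, is the supremum of the set
of $\alpha$-SS-achievable rates.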


B. Converse 

The following proposition is subsequently used for the
converse proof of the WTC II SS-capacity. The proposition
states that the strong-secrecy-capacity of a WTC I with a
DM-EC to the eavesdropper is an upper bound on the
strong-secrecy-capacity of the WTC II. To formulate the result,
slight modifications of some of the definitions from Sections
IV-A and V-A are required. Specifically, we redefine the
achievable rates for each setting with respect to a
strong-secrecy requirement (instead of SS).

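Definition 11 (Strong-Secrecy Achievability for WTC I)
A rate $R \in \mathbb{R}_+$ is strong-secrecy-achievable for the WTC I if
there is a sequence of $(n,R)$ codes $\{\mathcal{C}_{1,n}\}_{n\in\mathbb{N}}$ with

$$e^*(\mathcal{C}_{1,n}) \xrightarrow[n\to\infty]{} 0, \tag{67a}$$

$$I_{\mathcal{C}_{1,n}}(M;\mathbf{Z}) \xrightarrow[n\to\infty]{} 0, \tag{67b}$$

where $M$ is uniformly distributed over the message set $\mathcal{M}$.

Definition 12 (Strong-Secrecy Achievability for WTC II)
Let $\alpha \in [0,1]$ and $\mu = \lfloor \alpha n \rfloor$. A rate $R \in \mathbb{R}_+$ is
$\alpha$-strong-secrecy-achievable for the WTC II if there is a sequence of
$(n,R)$ codes $\{\mathcal{C}_{2,n}\}_{n\in\mathbb{N}}$ with

$$e^*(\mathcal{C}_{2,n}) \xrightarrow[n\to\infty]{} 0, \tag{68a}$$

$$\max_{\mathcal{S} \subseteq [1:n]:\ |\mathcal{S}| = \mu} I_{\mathcal{C}_{2,n}}(M;\mathbf{Z}) \xrightarrow[n\to\infty]{} 0, \tag{68b}$$

where $M$ is uniformly distributed over the message set $\mathcal{M}$.

The strong-secrecy-capacity for both settings is defined as
the supremum of the set of strong-secrecy-achievable rates.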

Proposition 1 (WTC I Upper Bounds WTC II) Let $\alpha \in (0,1]$
and $C_S^{\mathrm{II}}(\alpha)$ be the $\alpha$-strong-secrecy-capacity of the WTC
II with a main channel $Q_{Y|X}$. Furthermore, let $\beta \in [0,\alpha)$
and $C_S^{\mathrm{I}}(\beta)$ be the strong-secrecy-capacity of the WTC I with
transition probability $Q_{Y,Z|X} = Q_{Y|X}\mathcal{E}^{(\beta)}_{Z|X}$, where $\mathcal{E}^{(\beta)}_{Z|X}$ is a
DM-EC with erasure probability $\bar{\beta} = 1 - \beta$, i.e.,

$$\mathcal{E}^{(\beta)}_{Z|X}(?|x) = \bar{\beta} = 1 - \mathcal{E}^{(\beta)}_{Z|X}(x|x), \quad \forall x \in \mathcal{X}. \tag{69}$$

Then

$$C_S^{\mathrm{II}}(\alpha) \le C_S^{\mathrm{I}}(\beta) = \max_{Q_{U,X}:\ U - X - Y} \big[ I(U;Y) - \beta I(U;X) \big]. \tag{70}$$

See Section V-C1 for the proof. Proposition 1 is subsequently
combined with the following lemma to establish
the converse for the $\alpha$-SS-capacity of the WTC II.

Lemma 6 (Continuity of WTC I Capacity) As a function
of $\beta$,

$$C_S^{\mathrm{I}}(\beta) = \max_{Q_{U,X}:\ U - X - Y} \big[ I(U;Y) - \beta I(U;X) \big] \tag{71}$$

is continuous inside $(0,1)$.

The proof of Lemma 6 is relegated to Appendix E. The SS-capacity
of the WTC II with a noisy main channel is stated
next.

Theorem 2 (WTC II SS-Capacity) For any $\alpha \in [0,1]$,

$$C_{\mathrm{Sem}}(\alpha) = \max_{Q_{U,X}:\ U - X - Y} \big[ I(U;Y) - \alpha I(U;X) \big], \tag{72}$$

and one may restrict the cardinality of $U$ to $|\mathcal{U}| \le |\mathcal{X}|$.

The converse and direct parts of Theorem 2 are established
in Sections V-C2 and V-C3, respectively. As opposed to the
SS-capacity of the WTC I (where achievability may be derived
without using Lemma 1; see Remark 6), for the WTC II, the
stronger soft-covering lemma is essential for the direct proof.
Specifically, via the union bound, the double-exponential decay
that Lemma 1 provides is leveraged to show the existence
of a sequence of codes that satisfies the vanishing information
leakage requirement for all choices of $\mathcal{S}$ and $P_M$.

Remark 9 (Generalized WTC II SS-Capacity) The proof
of Theorem 2 is robust and readily extends to a more general
setting where the eavesdropper's observed symbols are corrupted
by random noise. Specifically, we refer to the scenario
where the eavesdropper first chooses a subset of indices
$\mathcal{S} \subseteq [1 : n]$ of size $\mu = \lfloor \alpha n \rfloor$, then $\mathbf{x}^{\mathcal{S}}$ is passed through
a DMC $Q_{Z|X}$, and the eavesdropper receives $Z_i \sim Q_{Z|X=x_i}$
for $i \in \mathcal{S}$, and $Z_i = ?$ otherwise. The $\alpha$-SS-capacity for this
case is

$$C^{(\mathrm{Noisy})}_{\mathrm{Sem}}(\alpha) = \max_{Q_{U,X}:\ U - X - (Y,Z)} \big[ I(U;Y) - \alpha I(U;Z) \big], \tag{73}$$

and recovers (72) by setting $Z = X$. Both the direct and the
converse proofs of (73) follow by a verbatim repetition of the
arguments from Section V-C, with two minor changes. First, for
the converse, the classic DM-EC from Proposition 1 (proven
in Section V-C1) is replaced with a cascade of the DM-EC and the
DMC $Q_{Z|X}$. Second, for the SS analysis in the direct proof
(Section V-C3), we replace the rate bound from (110) with
$\tilde{R} > \alpha I(U;Z)$ (the reliability analysis goes through without
changes).

Remark 10 The cardinality bound in Theorem 2 is established
using the convex cover method [35, Appendix C]. The
details are omitted.


Remark 11 Theorem 2 recovers the achievability result from
/ID Equation (7)] by setting U = X and taking X to be 
uniformly distributed over X. Furthermore, in secrecy was 
established while assuming a uniform distribution over the 
message set, i.e., on average over the messages. Although we 
require security with respect to a stricter metric (SS versus 
weak-secrecy), we achieve higher rates than m Equation (7)] 
and show their optimality. Moreover, to achieve (72), we use
classic wiretap codes and establish SS using the stronger soft- 
covering lemma, making the (rather convoluted) coset coding 
scheme from (inspired by /(T^) no longer required. 
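
To give a feel for the rate expression in Theorem 2, the sketch below evaluates $I(X;Y) - \alpha H(X)$, i.e., the choice $U = X$, for a binary symmetric main channel, and separately checks the erasure-channel identity $I(X;Z) = \beta H(X)$ that links the WTC II to the erasure WTC I. The channel parameters, the grid search and all function names are illustrative assumptions; since no channel prefixing is optimized, the value found is only a lower bound on the capacity expression.

```python
from math import log2

def h2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def wtc2_rate(q, crossover, alpha):
    """I(X;Y) - alpha * H(X) for X ~ Ber(q) sent over a BSC(crossover)."""
    y_one = q * (1 - crossover) + (1 - q) * crossover
    return h2(y_one) - h2(crossover) - alpha * h2(q)

def erasure_information(q, beta):
    """I(X;Z) in bits when Z = X with probability beta and Z = '?' otherwise."""
    joint = {(0, 0): (1 - q) * beta, (1, 1): q * beta,
             (0, '?'): (1 - q) * (1 - beta), (1, '?'): q * (1 - beta)}
    p_x = {0: 1 - q, 1: q}
    p_z = {}
    for (x, z), p in joint.items():
        p_z[z] = p_z.get(z, 0.0) + p
    return sum(p * log2(p / (p_x[x] * p_z[z]))
               for (x, z), p in joint.items() if p > 0)

# Assumed toy parameters: BSC(0.1) main channel, half the symbols observed.
crossover, alpha = 0.1, 0.5
uniform_rate = wtc2_rate(0.5, crossover, alpha)
best_rate = max(wtc2_rate(k / 1000, crossover, alpha) for k in range(1001))
```

For these parameters the uniform input gives roughly 0.031 bits per channel use, far below the main-channel capacity $1 - h_2(0.1)$, which reflects how costly it is to hide the message from an eavesdropper that perfectly observes half of the transmitted symbols.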


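C. Proofs

1) Proposition 1: The equality in (70) follows by evaluating
the strong-secrecy-capacity formula of a general WTC I, i.e.,

$$\max_{Q_{U,X}:\ U - X - (Y,Z)} \big[ I(U;Y) - I(U;Z) \big], \tag{74}$$

for the case where the transition probability matrix is
$Q_{Y,Z|X} = Q_{Y|X}\mathcal{E}^{(\beta)}_{Z|X}$. Let $\Phi \sim \mathrm{Ber}(\bar{\beta})$ be a random variable,
such that its i.i.d. samples define the erasure process of
the DM-EC with erasure probability $\bar{\beta}$. Accordingly, $\Phi$ is
independent of $X$ and

$$Z = \begin{cases} X, & \Phi = 0 \\ ?, & \Phi = 1. \end{cases} \tag{75}$$

First note that $\Phi$ is determined by $Z$ since $? \notin \mathcal{X}$.
Combining this with the Markov relation $U - X - (Y,Z)$
implies that the chain $U - X - (Y,Z,\Phi)$ is also Markov.
Along with the independence of $X$ and $\Phi$, this implies that
$U$ and $\Phi$ are also independent. Consequently, for every $Q_{U,X}$,
where $U - X - (Y,Z)$ forms a Markov chain, we have

$$\begin{aligned}
I(U;Z) &\stackrel{(a)}{=} I(U;\Phi,Z) \\
&\stackrel{(b)}{=} I(U;Z|\Phi) \\
&\stackrel{(c)}{=} \beta I(U;X) + \bar{\beta} I(U;?) \\
&= \beta I(U;X),
\end{aligned} \tag{76}$$

where (a) follows since $\Phi$ is defined by $Z$, while (b) and (c)
follow by the independence of $\Phi$ and $U$. Since (76) holds for
every $Q_{U,X}$ as above, we conclude that

$$C_S^{\mathrm{I}}(\beta) = \max_{Q_{U,X}:\ U - X - Y} \big[ I(U;Y) - \beta I(U;X) \big]. \tag{77}$$

To prove the inequality in (70), we show that for any $\alpha \in (0,1]$
and $\beta \in [0,\alpha)$, an $\alpha$-strong-secrecy-achievable rate for
the WTC II is also achievable for the WTC I with erasure
probability $\bar{\beta}$.

Fix $\alpha, \beta$ as above and let $R \in \mathbb{R}_+$ be an $\alpha$-strong-secrecy-achievable
rate for the WTC II. Furthermore, let $\{\mathcal{C}_{2,n}\}_{n\in\mathbb{N}}$
be the corresponding sequence of $(n,R)$ codes satisfying (68).
Since the channel to the legitimate receiver and the definition
of the maximal error probability are the same for both versions
of the WTC (see (67a) and (68a)), $\{\mathcal{C}_{2,n}\}_{n\in\mathbb{N}}$ is also reliable
when using it to transmit over the WTC I. Therefore, to
establish (70), it suffices to show that for every $\epsilon > 0$, there
is an $n^* \in \mathbb{N}$, such that for every $n > n^*$

$$I_{\mathcal{C}_{2,n}}(M;\mathbf{Z}_1) \le \epsilon, \tag{78}$$

where $\mathbf{Z}_1$ denotes the channel output sequence observed by
the eavesdropper of the WTC I. In other words, we show that
the sequence of codes $\{\mathcal{C}_{2,n}\}_{n\in\mathbb{N}}$, designed to achieve
strong-secrecy for the WTC II, also achieves strong-secrecy for the
WTC I.

Let $\mathbf{Z}_2$ be the channel output observed by the eavesdropper
of the WTC II, fix $\epsilon > 0$ and let $n_0 \in \mathbb{N}$ be such that for every
$n > n_0$,

$$\max_{\mathcal{S} \subseteq [1:n]:\ |\mathcal{S}| = \mu} I_{\mathcal{C}_{2,n}}(M;\mathbf{Z}_2) \le \frac{\epsilon}{2}. \tag{79}$$

For every $\mathbf{z} \in \mathcal{Z}^n$, where $\mathcal{Z} = \mathcal{X} \cup \{?\}$, define

$$\mathcal{A}(\mathbf{z}) = \big\{ i \in [1 : n] \,\big|\, z_i = ? \big\}, \tag{80}$$

and let $\Theta(\mathbf{Z})$ be

$$\Theta(\mathbf{Z}) = \mathbb{1}_{\{|\mathcal{A}(\mathbf{Z})| < \lceil \bar{\alpha} n \rceil\}}, \tag{81}$$

where $\bar{\alpha} = 1 - \alpha$. Namely, $\Theta$ indicates if the number of erasures in a sequence
$\mathbf{z} \in \mathcal{Z}^n$ is greater than or equal to $\lceil \bar{\alpha} n \rceil$ or not.

By conditioning the mutual information term from (78) on
$\Theta(\mathbf{Z}_1)$, we distinguish between the two cases of $\mathbf{Z}_1$ being
better or worse than $\mathbf{Z}_2$ in terms of the number of erased
symbols. When $\Theta(\mathbf{Z}_1) = 0$, i.e., $\mathbf{Z}_1$ is worse than $\mathbf{Z}_2$, security
for the WTC I is ensured since $\{\mathcal{C}_{2,n}\}_{n\in\mathbb{N}}$ achieve security for
the WTC II. Otherwise, for the case that $\Theta(\mathbf{Z}_1) = 1$, where
$\mathbf{Z}_1$ is better than $\mathbf{Z}_2$, we use Sanov's Theorem to show that the
probability of such an event exponentially decreases with the
blocklength $n$, while the mutual information grows linearly at
most. For any $n \in \mathbb{N}$, we have

$$\begin{aligned}
I_{\mathcal{C}_{2,n}}(M;\mathbf{Z}_1) &\stackrel{(a)}{=} I_{\mathcal{C}_{2,n}}\big(M;\Theta(\mathbf{Z}_1),\mathbf{Z}_1\big) \\
&\stackrel{(b)}{=} I_{\mathcal{C}_{2,n}}\big(M;\mathbf{Z}_1 \big| \Theta(\mathbf{Z}_1)\big) \\
&= \underbrace{\mathbb{P}\big(\Theta(\mathbf{Z}_1) = 0\big) I_{\mathcal{C}_{2,n}}\big(M;\mathbf{Z}_1 \big| \Theta(\mathbf{Z}_1) = 0\big)}_{I_0}
 + \underbrace{\mathbb{P}\big(\Theta(\mathbf{Z}_1) = 1\big) I_{\mathcal{C}_{2,n}}\big(M;\mathbf{Z}_1 \big| \Theta(\mathbf{Z}_1) = 1\big)}_{I_1},
\end{aligned} \tag{82}$$

where (a) is because $\Theta(\mathbf{Z}_1)$ is a function of $\mathbf{Z}_1$, while (b)
follows since the number of erasures in the output sequence
of a DM-EC is defined by an i.i.d. process that is independent
of the message.

For $I_0$, taking any $n > n_0$, (79) implies that

$$I_{\mathcal{C}_{2,n}}\big(M;\mathbf{Z}_1 \big| \Theta(\mathbf{Z}_1) = 0\big) \le \max_{\mathcal{S} \subseteq [1:n]:\ |\mathcal{S}| = \mu} I_{\mathcal{C}_{2,n}}(M;\mathbf{Z}_2) \le \frac{\epsilon}{2}. \tag{83}$$

To upper bound $I_1$, first note that

$$I_{\mathcal{C}_{2,n}}\big(M;\mathbf{Z}_1 \big| \Theta(\mathbf{Z}_1) = 1\big) \le n \log\big(|\mathcal{X}| + 1\big) \tag{84}$$

holds for every $n \in \mathbb{N}$. Now, fix any $\delta \in (\beta,\alpha)$ and let $\bar{\delta} = 1 - \delta$; there exists
an $n_1(\delta) \in \mathbb{N}$, such that for all $n > n_1$

$$\lceil \bar{\alpha} n \rceil \le \bar{\delta} n < \bar{\beta} n. \tag{85}$$

Thus, for every $n > n_1(\delta)$ Sanov's Theorem [22, Theorem
11.4.1] implies

$$\mathbb{P}\big(\Theta(\mathbf{Z}_1) = 1\big) \le \mathbb{P}\big(|\mathcal{A}(\mathbf{Z}_1)| < \bar{\delta} n\big) \le (n+1)^2 e^{-n D_b(\delta\|\beta)}, \tag{86}$$

where $D_b(\delta\|\beta) = \bar{\delta} \log\big(\frac{\bar{\delta}}{\bar{\beta}}\big) + \delta \log\big(\frac{\delta}{\beta}\big)$ is the relative
entropy between the PMFs of two binary random variables
distributed according to $\mathrm{Ber}(\delta)$ and $\mathrm{Ber}(\beta)$, respectively. Since
$\delta \neq \beta$, we have that $D_b(\delta\|\beta) > 0$, and therefore, there is an
$n_2 \in \mathbb{N}$ with $n_1(\delta) < n_2$, such that for every $n > n_2$,

$$I_1 \le (n+1)^2 e^{-n D_b(\delta\|\beta)} \cdot n \log\big(|\mathcal{X}| + 1\big) \le \frac{\epsilon}{2}. \tag{87}$$

Set $n^* = \max\{n_0, n_2\}$. Based on (83) and (87), for every
$n > n^*$, we have

$$I_{\mathcal{C}_{2,n}}(M;\mathbf{Z}_1) = I_0 + I_1 \le \epsilon, \tag{88}$$

which completes the proof.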
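
The Sanov-type step in the proof of Proposition 1 bounds the probability that a DM-EC erases unusually few symbols. The sketch below compares the exact binomial tail with a bound of the form $(n+1)^2 e^{-nD_b}$ and shows that the resulting bound on the leakage term also vanishes; the values $\beta = 0.3$, $\delta = 0.4$ and the binary input alphabet are hypothetical, and natural logarithms are assumed throughout.

```python
from math import comb, exp, log

def binary_kl(a, b):
    """Relative entropy (nats) between Ber(a) and Ber(b), for 0 < a, b < 1."""
    return a * log(a / b) + (1 - a) * log((1 - a) / (1 - b))

def lower_tail(n, p, threshold):
    """Exact P(Binomial(n, p) < threshold)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1) if k < threshold)

def sanov_bound(n, divergence):
    """Method-of-types bound (n + 1)^2 * exp(-n * D) for a binary alphabet."""
    return (n + 1) ** 2 * exp(-n * divergence)

beta, delta = 0.3, 0.4                      # assumed values with beta < delta
erase_prob, few_erasures = 1 - beta, 1 - delta
divergence = binary_kl(few_erasures, erase_prob)

# Exact tail versus the bound for a few blocklengths.
checks = [(n, lower_tail(n, erase_prob, few_erasures * n),
           sanov_bound(n, divergence)) for n in (50, 200, 800)]

# Bound on the leakage term: probability bound times n*log(|X| + 1), |X| = 2.
leak_bound_2000 = sanov_bound(2000, divergence) * 2000 * log(3)
```

The polynomial factors $(n+1)^2$ and $n\log(|\mathcal{X}|+1)$ are eventually dominated by the exponential decay, which is why the term $I_1$ can be made smaller than $\epsilon/2$ for all sufficiently large $n$.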

2) Theorem 2 - Converse: For the converse, we first show
that with respect to the notations used in Proposition 1,

$$C_S^{\mathrm{II}}(\alpha) \le C_S^{\mathrm{I}}(\alpha) = \max_{Q_{U,X}:\ U - X - Y} \big[ I(U;Y) - \alpha I(U;X) \big], \tag{89}$$

for any $\alpha \in [0,1]$. For $\alpha = 0, 1$, the relation is straightforward
as

$$C_S^{\mathrm{II}}(0) = \max_{Q_X} I(X;Y) = C_S^{\mathrm{I}}(0), \tag{90a}$$

$$C_S^{\mathrm{II}}(1) = 0 = C_S^{\mathrm{I}}(1). \tag{90b}$$

For $\alpha \in (0,1)$, (89) is established by relying on Proposition
1 and the continuity argument from Lemma 6. Namely,
taking the limit of (70) as $\beta \uparrow \alpha$ establishes (89).

Having this, the converse follows by arguments similar to
those presented in Section IV-C1. Fix $\alpha \in [0,1]$ and let
$R \in \mathbb{R}_+$ be an $\alpha$-SS-achievable rate for the WTC II and
$\{\mathcal{C}_n\}_{n\in\mathbb{N}}$ be its corresponding $(n,R)$ sequence of codes. By
the definitions in (42a) and (66), $\{\mathcal{C}_n\}_{n\in\mathbb{N}}$ are reliable and
$\alpha$-semantically-secure for every message distribution, and in
particular, for a uniform message distribution. This implies

$$C_{\mathrm{Sem}}(\alpha) \le C_S^{\mathrm{II}}(\alpha) \le \max_{Q_{U,X}:\ U - X - Y} \big[ I(U;Y) - \alpha I(U;X) \big], \tag{91}$$

and completes the converse proof.


Remark 12 Our converse proof relies on the achievability 
being defined in terms of a limit as n ^ oo (see Definition 
m- Namely, we show that in the limit, the eavesdropper in 
the WTC I setting is likely to be within a slightly higher 
channel-observation budget than that of the WTC II, which
by continuity won’t result in much extra rate. The chance of 
having too many channel observations is too small to provide 
non-negligible extra information. If, however, the blocklength 
n can be chosen as a design parameter, then it may be possible 
that a finite n results in a higher achievable secrecy-rate. For 
instance, notice that the optimal code of length $2n$ is not
necessarily better than the optimal code of length $n$, since
when the blocklength is longer the eavesdropper has more
flexibility in choosing his observations. 

3) Theorem 2 - Direct Part: As before, we start by showing
the achievability of (72) when $U = X$. After doing so, we use
channel prefixing to extend the proof to any $U$ with $U - X - Y$.

Fix $\alpha \in [0,1]$, $\epsilon > 0$ and a PMF $Q_X$ on $\mathcal{X}$. Letting $M$
and $W$ be independent random variables uniformly distributed
over $\mathcal{M}$ and $\mathcal{W} = [1 : 2^{n\tilde{R}}]$, respectively, we repeat the code
construction from Section IV-C1. A similar analysis of the
average error probability shows that if

$$R + \tilde{R} < I(X;Y), \tag{92}$$

then

$$\mathbb{E}_{\mathbb{C}_n} \frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} e_m(\mathbb{C}_n) \xrightarrow[n\to\infty]{} 0, \tag{93}$$

where $\mathbb{C}_n$ is the random code that corresponds to the random
codebook $\mathbb{B}_n$.

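Security Analysis: Fix $\mathcal{S} \subseteq [1 : n]$ with $|\mathcal{S}| = \mu = \lfloor \alpha n \rfloor$,
recall that $\mathcal{Z} = \mathcal{X} \cup \{?\}$ and define the following PMF on $\mathcal{Z}^n$,

$$\Gamma^{(\mathcal{S})}_{\mathbf{Z}}(\mathbf{z}) = \prod_{i\in\mathcal{S}} \Gamma_Z(z_i) \prod_{i\in\mathcal{S}^c} \mathbb{1}_{\{z_i = ?\}}, \quad \forall \mathbf{z} \in \mathcal{Z}^n, \tag{94}$$

where $\Gamma_Z$ is the average output PMF of the identity DMC on
$\mathcal{X}$, i.e.,

$$\Gamma_Z(z) = \sum_{x\in\mathcal{X}} Q_X(x) \mathbb{1}_{\{z = x\}} = \begin{cases} Q_X(z), & z \in \mathcal{X} \\ 0, & z = ?. \end{cases} \tag{95}$$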


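For any $\mathcal{C}_n$ (defined by fixing $\mathcal{B}_n$) and $P_M \in \mathcal{P}(\mathcal{M})$, the
relative entropy chain rule implies

$$\begin{aligned}
I_{\mathcal{C}_n}(M;\mathbf{Z}) &= D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}|M} \Big\| P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}} \Big| P_M \Big) \\
&= D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}|M} \Big\| \Gamma^{(\mathcal{S})}_{\mathbf{Z}} \Big| P_M \Big) - D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}} \Big\| \Gamma^{(\mathcal{S})}_{\mathbf{Z}} \Big),
\end{aligned} \tag{96}$$

and therefore

$$\begin{aligned}
\max_{P_M \in \mathcal{P}(\mathcal{M})} I_{\mathcal{C}_n}(M;\mathbf{Z})
&\le \max_{P_M \in \mathcal{P}(\mathcal{M})} D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}|M} \Big\| \Gamma^{(\mathcal{S})}_{\mathbf{Z}} \Big| P_M \Big) \\
&\le \max_{P_M \in \mathcal{P}(\mathcal{M})} \sum_{m\in\mathcal{M}} P(m) \max_{\tilde{m}\in\mathcal{M}} D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}|M=\tilde{m}} \Big\| \Gamma^{(\mathcal{S})}_{\mathbf{Z}} \Big) \\
&= \max_{m\in\mathcal{M}} D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}|M=m} \Big\| \Gamma^{(\mathcal{S})}_{\mathbf{Z}} \Big).
\end{aligned} \tag{97}$$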


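For any $\emptyset \neq \mathcal{A} \subseteq [1 : n]$ and $\mathbf{z} \in \mathcal{Z}^n$, recall that $\mathbf{z}^{\mathcal{A}} =
(z_i)_{i\in\mathcal{A}}$ is the sub-vector of $\mathbf{z}$ indexed by the elements of $\mathcal{A}$.
The relative entropy chain rule further simplifies the RHS of
(97). For any $m \in \mathcal{M}$,

$$\begin{aligned}
D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}|M=m} \Big\| \Gamma^{(\mathcal{S})}_{\mathbf{Z}} \Big)
&= D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}},\mathbf{Z}^{\mathcal{S}^c}|M=m} \Big\| \Gamma^{(\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}},\mathbf{Z}^{\mathcal{S}^c}} \Big) \\
&= D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}|M=m} \Big\| \Gamma^{(\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}} \Big)
 + D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}^c}|\mathbf{Z}^{\mathcal{S}},M=m} \Big\| \Gamma^{(\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}^c}|\mathbf{Z}^{\mathcal{S}}} \Big| P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}|M=m} \Big) \\
&\stackrel{(a)}{=} D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}|M=m} \Big\| \Gamma^{(\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}} \Big) \\
&\stackrel{(b)}{=} D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}|M=m} \Big\| \Gamma_Z^{\mu} \Big),
\end{aligned} \tag{98}$$

where (a) is because $P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}^c}|\mathbf{Z}^{\mathcal{S}}=\mathbf{z}^{\mathcal{S}},M=m} = \Gamma^{(\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}^c}|\mathbf{Z}^{\mathcal{S}}=\mathbf{z}^{\mathcal{S}}} = \mathbb{1}_{\{\mathbf{z}^{\mathcal{S}^c} = (?,\ldots,?)\}}$
for every $\mathbf{z}^{\mathcal{S}} \in \mathcal{Z}^{|\mathcal{S}|}$, and (b) follows from (94).
Combining (96)-(98), we have that for every $\mathcal{C}_n$ and $\mathcal{S} \subseteq
[1 : n]$, with $|\mathcal{S}| = \mu = \lfloor \alpha n \rfloor$,

$$\max_{P_M \in \mathcal{P}(\mathcal{M})} I_{\mathcal{C}_n}(M;\mathbf{Z}) \le \max_{m\in\mathcal{M}} D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}|M=m} \Big\| \Gamma_Z^{\mu} \Big). \tag{99}$$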


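In particular, (99) also holds when maximizing over the subsets
$\mathcal{S}$, which gives

$$\mathrm{Sem}_\mu(\mathcal{C}_n) \le \max_{\substack{m\in\mathcal{M}, \\ \mathcal{S} \subseteq [1:n]:\ |\mathcal{S}| = \mu}} D\Big( P^{(\mathcal{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}|M=m} \Big\| \Gamma_Z^{\mu} \Big). \tag{100}$$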


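Having (100), let $\delta$ be an arbitrary positive real number to
be determined later and consider the following probability.

$$\begin{aligned}
\mathbb{P}\Big( \big\{ \mathrm{Sem}_\mu(\mathbb{C}_n) \le e^{-n\delta} \big\}^c \Big)
&= \mathbb{P}\Big( \max_{\substack{P_M \in \mathcal{P}(\mathcal{M}), \\ \mathcal{S} \subseteq [1:n]:\ |\mathcal{S}| = \mu}} D\Big( P^{(\mathbb{C}_n,\mathcal{S})}_{\mathbf{Z}|M} \Big\| P^{(\mathbb{C}_n,\mathcal{S})}_{\mathbf{Z}} \Big| P_M \Big) > e^{-n\delta} \Big) \\
&\stackrel{(a)}{\le} \mathbb{P}\Big( \max_{\substack{m\in\mathcal{M}, \\ \mathcal{S} \subseteq [1:n]:\ |\mathcal{S}| = \mu}} D\Big( P^{(\mathbb{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}|M=m} \Big\| \Gamma_Z^{\mu} \Big) > e^{-n\delta} \Big) \\
&= \mathbb{P}\Big( \bigcup_{\substack{m\in\mathcal{M}, \\ \mathcal{S} \subseteq [1:n]:\ |\mathcal{S}| = \mu}} \Big\{ D\Big( P^{(\mathbb{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}|M=m} \Big\| \Gamma_Z^{\mu} \Big) > e^{-n\delta} \Big\} \Big) \\
&\stackrel{(b)}{\le} \sum_{m\in\mathcal{M}} \sum_{\mathcal{S} \subseteq [1:n]:\ |\mathcal{S}| = \mu} \mathbb{P}\Big( D\Big( P^{(\mathbb{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}|M=m} \Big\| \Gamma_Z^{\mu} \Big) > e^{-n\delta} \Big),
\end{aligned} \tag{101}$$

where (a) uses (100), and (b) is the union bound.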

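Each term in the sum on the RHS of (101) falls into the
framework of the stronger soft-covering lemma, with respect
to a blocklength of $\mu$ and the identity channel. Noting that
$|\mathcal{W}| = 2^{n\tilde{R}} = 2^{\mu \frac{n}{\mu}\tilde{R}}$, we have that as long as

$$\frac{n}{\mu}\tilde{R} > H(X), \tag{102}$$

there exist $\delta_1, \delta_2 > 0$ that for sufficiently large $n$ satisfy

$$\mathbb{P}\Big( D\Big( P^{(\mathbb{C}_n,\mathcal{S})}_{\mathbf{Z}^{\mathcal{S}}|M=m} \Big\| \Gamma_Z^{\mu} \Big) > e^{-n\delta_1} \Big) \le e^{-e^{n\delta_2}}. \tag{103}$$

Since $\mu = \lfloor \alpha n \rfloor \le \alpha n$, taking

$$\tilde{R} > \alpha H(X) \tag{104}$$

is sufficient to satisfy (102) for every $n \in \mathbb{N}$.

Setting $\delta = \delta_1$ and plugging (103) into (101) gives

$$\mathbb{P}\Big( \big\{ \mathrm{Sem}_\mu(\mathbb{C}_n) \le e^{-n\delta_1} \big\}^c \Big) \le \sum_{m\in\mathcal{M}} \sum_{\mathcal{S} \subseteq [1:n]:\ |\mathcal{S}| = \mu} e^{-e^{n\delta_2}} \le 2^n 2^{nR} e^{-e^{n\delta_2}} \triangleq \kappa_n \xrightarrow[n\to\infty]{} 0.$$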
Invoking Lemma 5 once more, we have that if (92) and
(102) are satisfied, then there is a sequence of $(n,R)$
$\alpha$-semantically-secure codes $\{\mathcal{C}_n\}_{n\in\mathbb{N}}$ with

$$\frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} e_m(\mathcal{C}_n) \xrightarrow[n\to\infty]{} 0. \tag{105}$$

The pruning argument from Section IV-C1 again upgrades
$\{\mathcal{C}_n\}_{n\in\mathbb{N}}$ to be reliable with respect to the maximal error
probability. Combining (92) and (102) shows the achievability
of

$$R < \max_{Q_X} \big[ I(X;Y) - \alpha H(X) \big]. \tag{106}$$


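Finally, we prefix a DMC $Q_{X|U}$ to the original WTC II to
obtain a new main channel $Q_{Y|U}$, given by

$$Q_{Y|U}(y|u) = \sum_{x\in\mathcal{X}} Q_{X|U}(x|u)\, Q_{Y|X}(y|x). \tag{107}$$

Furthermore, $\Gamma^{(\mathcal{S})}_{\mathbf{Z}}$ from (94) is redefined as

$$\Gamma^{(\mathcal{S})}_{\mathbf{Z}}(\mathbf{z}) = \prod_{i\in\mathcal{S}} Q_Z(z_i) \prod_{i\in\mathcal{S}^c} \mathbb{1}_{\{z_i = ?\}}, \tag{108}$$

where $Q_Z$ is given by

$$Q_Z(z) = \sum_{(u,x)\in\mathcal{U}\times\mathcal{X}} Q_U(u) Q_{X|U}(x|u) \mathbb{1}_{\{z = x\}} = \begin{cases} \sum_{u\in\mathcal{U}} Q_U(u) Q_{X|U}(z|u), & z \in \mathcal{X} \\ 0, & z = ?. \end{cases}$$

Repeating a similar analysis as above shows that reliability is
achieved if

$$R + \tilde{R} < I(U;Y), \tag{109}$$

while the rate needed for the stronger soft-covering lemma is

$$\tilde{R} > \alpha I(U;X). \tag{110}$$

Putting (109)-(110) together yields that any rate $R \in \mathbb{R}_+$
satisfying

$$R < \max_{Q_{U,X}:\ U - X - Y} \big[ I(U;Y) - \alpha I(U;X) \big] \tag{111}$$

is strongly $\alpha$-SS-achievable and concludes the proof.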


VI. Summary and Concluding Remarks 

We derived the SS capacity of the WTC II with a noisy 
main channel. The SS metric ensures that the unnormalized 
mutual information between the message and the eavesdrop¬ 
per’s observation is arbitrarily small, even when maximized 
over all message distributions and all possible choices of 
the eavesdropper’s observation. The main tool used in the 
direct proof is a novel and stronger version of Wyner’s soft 
covering lemma, that states that a random codebook achieves 
the soft-covering phenomenon with high probability as long 
as its rate is higher than the mutual information between the 
input and output of the DMC. Furthermore, the probability of 
failure is doubly-exponentially small in the blocklength, thus 
making the lemma advantageous in proving the existence of 
codebooks that satisfy exponentially many constraints. A code 
that achieves SS for the considered WTC II should do just that. 

The SS capacity was achieved by using classic Wyner’s 
wiretap codes. Since the combined number of messages and 
subsets grows only exponentially with the blocklength, SS 
was established by applying the union bound and invoking 
the stronger soft-covering lemma. The direct proof showed 
that rates up to the weak-secrecy capacity of the WTC I with
a DM-EC to the eavesdropper are achievable. The converse
followed by showing that the capacity of this WTC I is an 
upper bound on the SS capacity of the WTC II. 

As a preliminary and simple application of the stronger soft- 
covering lemma, it was used to achieve SS for the WTC I. 
A main goal in doing so was to emphasize the advantage 
of this approach over other methods for achieving SS for 
this scenario, such as the expurgation technique. While the 
expurgation method fails to generalize to some multiuser 
settings, such as the multiple access WTC, an achievability 
proof that relies on the stronger soft-covering lemma goes 
through by similar steps to those presented here. This makes
the stronger soft-covering lemma a tool by which the common
weak-secrecy and strong-secrecy results can be upgraded to
SS. Furthermore, the lemma might prove useful in any other
scenario in which performance is measured with respect to an
exponential number of constraints.
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Appendix A 
Proof of Lemma 2

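Let $n_0 \in \mathbb{N}$ be such that (6) holds for any $n > n_0$. For
these values of $n$ we have

$$\begin{aligned}
\mathbb{E}_{\mathbb{B}_n} D\Big( P^{(\mathbb{B}_n)}_{\mathbf{V}} \Big\| Q_V^n \Big)
&= \mathbb{E}_{\mathbb{B}_n}\Big[ D\Big( P^{(\mathbb{B}_n)}_{\mathbf{V}} \Big\| Q_V^n \Big) \mathbb{1}_{\{D(P^{(\mathbb{B}_n)}_{\mathbf{V}} \| Q_V^n) \le e^{-n\gamma_1}\}} \Big]
 + \mathbb{E}_{\mathbb{B}_n}\Big[ D\Big( P^{(\mathbb{B}_n)}_{\mathbf{V}} \Big\| Q_V^n \Big) \mathbb{1}_{\{D(P^{(\mathbb{B}_n)}_{\mathbf{V}} \| Q_V^n) > e^{-n\gamma_1}\}} \Big] \\
&\stackrel{(a)}{\le} e^{-n\gamma_1} + n \log\Big(\frac{1}{\mu_V}\Big) \mathbb{P}\Big( D\Big( P^{(\mathbb{B}_n)}_{\mathbf{V}} \Big\| Q_V^n \Big) > e^{-n\gamma_1} \Big) \\
&\stackrel{(b)}{\le} e^{-n\gamma_1} + n \log\Big(\frac{1}{\mu_V}\Big) e^{-e^{n\gamma_2}},
\end{aligned} \tag{112}$$

where (a) follows because for every fixed $\mathcal{B}_n$

$$D\Big( P^{(\mathcal{B}_n)}_{\mathbf{V}} \Big\| Q_V^n \Big) \le n \log\Big(\frac{1}{\mu_V}\Big),$$

and $\mu_V = \min_{v \in \mathrm{supp}(Q_V)} Q_V(v)$, while (b) follows from (6).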

Appendix B 
Proof of Lemma 3

Fix a codebook C„ and define 


O _ -f] 

“ {(u(W,B„),v)^A.} 

Note that for 9 = 1,2, we have 


-f 1. 


(113: 


wGW 


—nR/^n 

V\U=u{w,Bn) 


I ^ 

{e=2}n{(u(2ii,H„),v)^A.} 

= J dPB^,e. (114) 

and consequently, for every measurable ^ C V", 

(v € ,4,0 = 0 ) = (v G 4l) 


= / dPiSy ,e ■ 


(115) 


For simplicity of notation, denote p(f"^ = P, Pb„,i — Pi, 
Pb „,2 — P 2 , Qv — Q P 0 ®"^ — Fe, and consider 


D(P||<3) = /<iPl„g(^) 


(a) 


dP 




' dQ 


dP 


dQ 


J dQEre 

(c) r 

< / dQ Epe 


1 


dPe 


1 


_ ^ 

re(0) ■ dQ J 


_re(0) dQ 

X log ^Epe 
1 dPe 

r0(0) 

f 1 dPe\ 
"" °®Vre(0) ■ dQ ) 




0 = 1,2 


X log 


1 dPg 
re(0) ■ dQ 


E log 

0 = 1,2 




dPg 




dPi 


E / dPe log AB„^,g, (116) 

0 = 1 , 2 ^ 


where: 

(a) follows since for any two measures /r, A with /r ^ A and 
a /i— integrable function g, we have 

J gdg = J g^dX] (117) 

(b) follows from (II151) and the law of total probability; 

(c) follows by applying Jensens inequality to the convex 
function x i->- a:log(a;); 

(d) follows by the properties of the logarithm and ( II171 ); 

(e) follows from (II141 ) and the definition of Ag„ g, for 
0 = 1, 2, in ([nl). 
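The inequality in (116) can be checked on a finite alphabet, where the Radon-Nikodym derivatives reduce to ratios of probability masses. The sketch below is an illustration under assumptions rather than the construction used in the paper: it splits an arbitrary distribution `p` into two nonnegative parts `p1 + p2` using random fractions (in the paper the split is determined by the typical set, which is not modeled here) and compares $D(P\|Q)$ with $h(\int dP_1) + \sum_\theta \int dP_\theta \log(dP_\theta/dQ)$.

```python
import math
import random


def binary_entropy(x):
    """Binary entropy h(x) in nats, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)


def kl_divergence(p, q):
    """Relative entropy D(p||q) in nats for probability vectors p and q."""
    return sum(pv * math.log(pv / qv) for pv, qv in zip(p, q) if pv > 0)


def split_bound(p1, p2, q):
    """Right-hand side of (116) for sub-measures p1, p2 with p1 + p2 = p."""
    total = binary_entropy(sum(p1))
    for part in (p1, p2):
        total += sum(x * math.log(x / qv) for x, qv in zip(part, q) if x > 0)
    return total


def random_distribution(size, rng):
    """A random probability vector with strictly positive entries."""
    weights = [rng.random() + 1e-3 for _ in range(size)]
    norm = sum(weights)
    return [w / norm for w in weights]


rng = random.Random(1)
size = 6  # hypothetical alphabet size
p = random_distribution(size, rng)
q = random_distribution(size, rng)

# Hypothetical split: an arbitrary fraction of each mass goes to p1, the rest to p2.
fractions = [rng.random() for _ in range(size)]
p1 = [f * x for f, x in zip(fractions, p)]
p2 = [(1.0 - f) * x for f, x in zip(fractions, p)]

lhs = kl_divergence(p, q)
rhs = split_bound(p1, p2, q)
print("D(P||Q) =", lhs, "<= bound =", rhs, ":", lhs <= rhs + 1e-12)
```

When $P = Q$ and each mass is split evenly, both sides equal zero, so the Jensen step (c) is tight in that case.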

Appendix C 

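Proof of the Chernoff Bound - Lemma 4 

Let $X$ have the same distribution as $X_1$. For any $\lambda > 0$, 
we have 

$$
\begin{aligned}
\mathbb{P}\left( \frac{1}{M} \sum_{m=1}^{M} X_m \ge c \right)
&\overset{(a)}{\le} \left( \frac{\mathbb{E} e^{\lambda X}}{e^{\lambda c}} \right)^{M} \\
&\overset{(b)}{\le} \left( \frac{1 + \frac{e^{\lambda B} - 1}{B}\, \mathbb{E} X}{e^{\lambda c}} \right)^{M} \\
&\le \left( \frac{1 + \frac{e^{\lambda B} - 1}{B}\, \mu}{e^{\lambda c}} \right)^{M} \\
&\overset{(c)}{\le} e^{M \left( \frac{\mu}{B} \left( e^{\lambda B} - 1 \right) - \lambda c \right)},
\end{aligned}
\tag{118}
$$

where (a) is the Chernoff bound, (b) uses the fact that 
$e^{\lambda x} \le 1 + \frac{e^{\lambda B} - 1}{B} x$ for $x \in [0, B]$ due to convexity, and (c) 
uses $1 + x \le e^{x}$. 

Optimizing the RHS of (118) over $\lambda$ gives $\lambda^\star = \frac{1}{B} \ln \frac{c}{\mu}$ 
as the minimizer as long as $\frac{c}{\mu} \ge 1$. Plugging this into (118) 
yields 

$$
\mathbb{P}\left( \frac{1}{M} \sum_{m=1}^{M} X_m \ge c \right)
\le e^{-\frac{M \mu}{B} \left( \frac{c}{\mu} \left( \ln \frac{c}{\mu} - 1 \right) + 1 \right)},
\quad \forall \frac{c}{\mu} \ge 1.
\tag{119}
$$

This is a good bound when $\mu \ll B$, as it is in our case. If 
$c/\mu$ is shrinking, then to further simplify the bound consider 
the third order Taylor expansion of $x(\ln x - 1)$ about $x = 1$, 

$$
x(\ln x - 1) + 1 \ge \frac{1}{2}(x - 1)^2 - \frac{1}{6}(x - 1)^3, \quad \forall x \ge 1.
\tag{120}
$$

The RHS of (120) is a lower bound because the fourth 
derivative of $x(\ln x - 1)$ is positive for all $x \ge 1$. Furthermore, if $x - 1 \le 1$, 
we have 

$$
\frac{1}{2}(x - 1)^2 - \frac{1}{6}(x - 1)^3 \ge \frac{1}{3}(x - 1)^2, \quad \forall x \in [1, 2].
\tag{121}
$$

Putting it all together gives 

$$
\mathbb{P}\left( \frac{1}{M} \sum_{m=1}^{M} X_m \ge c \right)
\le e^{-\frac{M \mu}{3B} \left( \frac{c}{\mu} - 1 \right)^2},
\quad \forall \frac{c}{\mu} \in [1, 2].
\tag{122}
$$

Appendix D 
Proof of Lemma 5 

Since $f_i^{(n)}$, $i \in [1 : I]$, are bounded and by (57), 
there exists a sequence $\{\delta_n\}_{n \in \mathbb{N}}$ such that 

$$
\mathbb{E} f_i^{(n)}(A_n) \le \delta_n, \quad \forall i \in [1 : I], \ n \in \mathbb{N},
\tag{123}
$$

and $\delta_n \to 0$ as $n \to \infty$. We have 

$$
\begin{aligned}
\mathbb{P}\left( \bigcup_{i=1}^{I} \left\{ f_i^{(n)}(A_n) \ge (I + 1)\delta_n \right\} \right)
&\overset{(a)}{\le} \sum_{i=1}^{I} \mathbb{P}\left( f_i^{(n)}(A_n) \ge (I + 1)\delta_n \right) \\
&\overset{(b)}{\le} \sum_{i=1}^{I} \frac{\mathbb{E} f_i^{(n)}(A_n)}{(I + 1)\delta_n} \\
&\le \frac{I}{I + 1} < 1,
\end{aligned}
\tag{124}
$$

where (a) is the union bound and (b) is Markov's inequality. 
Therefore, there exists a realization $\{a_n\}_{n \in \mathbb{N}}$ of $\{A_n\}_{n \in \mathbb{N}}$ 
such that 

$$
f_i^{(n)}(a_n) < (I + 1)\delta_n \triangleq \delta_n', \quad \forall i \in [1 : I], \ n \in \mathbb{N}.
\tag{125}
$$

Since $I < \infty$ independently of $n$, we have $\delta_n' \to 0$ as $n \to \infty$. 

Appendix E 
Proof of Lemma 6 

We prove the continuity of $C_{\mathsf{W}}^{\mathsf{I}}(\beta)$ inside $(0, 1)$ by showing 
that it is bounded and convex. Let $\beta_1, \beta_2 \in (0, 1)$, $\lambda \in [0, 1]$, 
$\bar{\lambda} = 1 - \lambda$, and observe that 

$$
\begin{aligned}
C_{\mathsf{W}}^{\mathsf{I}}(\lambda \beta_1 + \bar{\lambda} \beta_2)
&= \max_{\substack{Q_{U,X}: \\ U - X - Y}} \Big[ (\lambda + \bar{\lambda}) I(U; Y) - (\lambda \beta_1 + \bar{\lambda} \beta_2) I(U; X) \Big] \\
&\le \lambda \max_{\substack{Q_{U,X}: \\ U - X - Y}} \Big[ I(U; Y) - \beta_1 I(U; X) \Big]
 + \bar{\lambda} \max_{\substack{Q_{U,X}: \\ U - X - Y}} \Big[ I(U; Y) - \beta_2 I(U; X) \Big] \\
&= \lambda C_{\mathsf{W}}^{\mathsf{I}}(\beta_1) + \bar{\lambda} C_{\mathsf{W}}^{\mathsf{I}}(\beta_2).
\end{aligned}
\tag{126}
$$

Furthermore, for every $\beta \in (0, 1)$, 

$$
C_{\mathsf{W}}^{\mathsf{I}}(\beta) \le \max_{Q_X} I(X; Y) \le \log |\mathcal{Y}| < \infty.
\tag{127}
$$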
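Returning to the Chernoff bound of Lemma 4, the chain of bounds (119) and (122) can be compared against an exact tail probability. The following Python sketch does this for i.i.d. Bernoulli variables, for which $B = 1$ and $\mu = p$, so the exact upper tail is a binomial sum. The choice of Bernoulli variables and the values of `M`, `p`, and the ratios $c/\mu$ are assumptions made for illustration; the lemma itself covers any i.i.d. variables supported on $[0, B]$.

```python
import math


def chernoff_bound_119(M, mu, B, c):
    """Bound (119): exp(-(M*mu/B) * (x*(ln x - 1) + 1)) with x = c/mu >= 1."""
    x = c / mu
    return math.exp(-(M * mu / B) * (x * (math.log(x) - 1.0) + 1.0))


def chernoff_bound_122(M, mu, B, c):
    """Simplified bound (122): exp(-(M*mu/(3B)) * (x - 1)^2), for x = c/mu in [1, 2]."""
    x = c / mu
    return math.exp(-(M * mu / (3.0 * B)) * (x - 1.0) ** 2)


def binomial_upper_tail(M, p, k):
    """Exact P(Bin(M, p) >= k)."""
    return sum(math.comb(M, j) * p**j * (1.0 - p) ** (M - j)
               for j in range(k, M + 1))


def compare_bounds(M, p, ratio):
    """Return (exact tail, bound (119), bound (122)) for threshold c = ratio * p."""
    c = ratio * p
    # The event (1/M) * sum X_m >= c is the event sum X_m >= ceil(c * M);
    # the small offset guards against floating-point round-up of c * M.
    k = math.ceil(c * M - 1e-9)
    return (binomial_upper_tail(M, p, k),
            chernoff_bound_119(M, p, 1.0, c),
            chernoff_bound_122(M, p, 1.0, c))


# Hypothetical parameters: 200 Bernoulli(0.1) variables.
M, p = 200, 0.1
for ratio in (1.0, 1.25, 1.5, 2.0):
    exact, b119, b122 = compare_bounds(M, p, ratio)
    print(f"c/mu = {ratio}: exact = {exact:.3e}, (119) = {b119:.3e}, (122) = {b122:.3e}")
```

For every ratio in $[1, 2]$ the exact tail lies below (119), which in turn lies below (122), reflecting the loss incurred by the Taylor simplification in (120) and (121).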